///
/// Copyright (c) 2017-2019 ARM Limited.
///
/// SPDX-License-Identifier: MIT
///
/// Permission is hereby granted, free of charge, to any person obtaining a copy
/// of this software and associated documentation files (the "Software"), to
/// deal in the Software without restriction, including without limitation the
/// rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
/// sell copies of the Software, and to permit persons to whom the Software is
/// furnished to do so, subject to the following conditions:
///
/// The above copyright notice and this permission notice shall be included in all
/// copies or substantial portions of the Software.
///
/// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
/// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
/// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
/// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
/// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
/// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
/// SOFTWARE.
///
namespace arm_compute
{
namespace test
{
/**
@page tests Validation and benchmark tests

@tableofcontents

@section tests_overview Overview

Benchmark and validation tests are based on the same framework to set up and run
the tests. In addition to running simple, self-contained test functions the
framework supports fixtures and data test cases. The former allow common setup
routines to be shared between various backends, thus reducing the amount of
duplicated code. The latter can be used to parameterize tests or fixtures with
different inputs, e.g. different tensor shapes. One limitation is that
tests/fixtures cannot be parameterized based on the data type if static type
information is needed within the test (e.g. to validate the results).

@note By default tests are not built. To enable them you need to add validation_tests=1 and/or benchmark_tests=1 to your SCons command line.

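For example, a build that enables both test suites could look like this (a sketch; the `neon=1` and `opencl=1` options are only illustrative, keep whatever build options you normally pass to SCons):

    scons neon=1 opencl=1 validation_tests=1 benchmark_tests=1
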
@note Tests are not included in the pre-built binary archive; you have to build them from source.

@subsection tests_overview_structure Directory structure

    .
    `-- tests                   <- Top level test directory. All files in here are shared among validation and benchmark.
        |-- framework           <- Underlying test framework.
        |-- CL \
        |-- NEON                -> Backend specific files with helper functions etc.
        |-- benchmark           <- Top level directory for the benchmarking files.
        |   |-- fixtures        <- Fixtures for benchmark tests.
        |   |-- CL              <- OpenCL backend test cases on a function level.
        |   |   `-- SYSTEM      <- OpenCL system tests, e.g. whole networks
        |   `-- NEON            <- Same for NEON
        |       `-- SYSTEM
        |-- datasets            <- Datasets for benchmark and validation tests.
        |-- main.cpp            <- Main entry point for the tests. Currently shared between validation and benchmarking.
        |-- networks            <- Network classes for system level tests.
        `-- validation          -> Top level directory for validation files.
            |-- CPP             -> C++ reference code
            |-- CL \
            |-- NEON            -> Backend specific test cases
            `-- fixtures        -> Fixtures shared among all backends. Used to set up the target function and tensors.

@subsection tests_overview_fixtures Fixtures

Fixtures can be used to share common setup, teardown or even run tasks among
multiple test cases. For that purpose a fixture can define a `setup`,
`teardown` and `run` method. Additionally, the constructor and destructor can
also be customized.

An instance of the fixture is created immediately before the actual test is
executed. After construction the @ref framework::Fixture::setup method is called. Then the test
function or the fixture's `run` method is invoked. After test execution the
@ref framework::Fixture::teardown method is called and lastly the fixture is destroyed.

@subsubsection tests_overview_fixtures_fixture Fixture

Fixtures for non-parameterized tests are straightforward. The custom fixture
class has to inherit from @ref framework::Fixture and can implement any of the
`setup`, `teardown` or `run` methods. None of these methods takes any arguments
or returns anything.

    class CustomFixture : public framework::Fixture
    {
        void setup()
        {
            _ptr = malloc(4000);
        }

        void run()
        {
            ARM_COMPUTE_ASSERT(_ptr != nullptr);
        }

        void teardown()
        {
            free(_ptr);
        }

        void *_ptr;
    };

@subsubsection tests_overview_fixtures_data_fixture Data fixture

The advantage of a parameterized fixture is that arguments can be passed to the setup method at runtime. To make this possible the setup method has to be a template with a type parameter for every argument (though the template parameter doesn't have to be used). All other methods remain the same.

    class CustomFixture : public framework::Fixture
    {
    #ifdef ALTERNATIVE_DECLARATION
        template <typename ...>
        void setup(size_t size)
        {
            _ptr = malloc(size);
        }
    #else
        template <typename T>
        void setup(T size)
        {
            _ptr = malloc(size);
        }
    #endif

        void run()
        {
            ARM_COMPUTE_ASSERT(_ptr != nullptr);
        }

        void teardown()
        {
            free(_ptr);
        }

        void *_ptr;
    };

@subsection tests_overview_test_cases Test cases

All of the following test case macros can optionally be prefixed with
`EXPECTED_FAILURE_` or `DISABLED_`.

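For example, a test can be declared with the `DISABLED_` prefix (a sketch; the unprefixed `TEST_CASE` form is described in the next section):

    DISABLED_TEST_CASE(TestCaseName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(1 + 1, 2);
    }
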
@subsubsection tests_overview_test_cases_test_case Test case

A simple test case function taking no inputs and having no (shared) state.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the dataset mode in which the test will be active.


    TEST_CASE(TestCaseName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(1 + 1, 2);
    }

@subsubsection tests_overview_test_cases_fixture_fixture_test_case Fixture test case

A simple test case function taking no inputs that inherits from a fixture. The
test case will have access to all public and protected members of the fixture.
Only the setup and teardown methods of the fixture will be used. The body of
this function will be used as the test function.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.


    class FixtureName : public framework::Fixture
    {
    public:
        void setup() override
        {
            _one = 1;
        }

    protected:
        int _one;
    };

    FIXTURE_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(_one + 1, 2);
    }

@subsubsection tests_overview_test_cases_fixture_register_fixture_test_case Registering a fixture as test case

Allows a fixture to be used directly as a test case. Instead of defining a new
test function, the `run` method of the fixture is executed.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.


    class FixtureName : public framework::Fixture
    {
    public:
        void setup() override
        {
            _one = 1;
        }

        void run() override
        {
            ARM_COMPUTE_ASSERT_EQUAL(_one + 1, 2);
        }

    protected:
        int _one;
    };

    REGISTER_FIXTURE_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT);

@subsubsection tests_overview_test_cases_data_test_case Data test case

A parameterized test case function that has no (shared) state. The dataset will
be used to generate versions of the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the dataset mode in which the test will be active.
- Third argument is the dataset.
- Further arguments specify names of the arguments to the test function. The
  number must match the arity of the dataset.


    DATA_TEST_CASE(TestCaseName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}), num)
    {
        ARM_COMPUTE_ASSERT(num < 4);
    }

@subsubsection tests_overview_test_cases_fixture_data_test_case Fixture data test case

A parameterized test case that inherits from a fixture. The test case will have
access to all public and protected members of the fixture. Only the setup and
teardown methods of the fixture will be used. The setup method of the fixture
needs to be a template and has to accept inputs from the dataset as arguments.
The body of this function will be used as the test function. The dataset will be
used to generate versions of the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.
- Fourth argument is the dataset.


    class FixtureName : public framework::Fixture
    {
    public:
        template <typename T>
        void setup(T num)
        {
            _num = num;
        }

    protected:
        int _num;
    };

    FIXTURE_DATA_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}))
    {
        ARM_COMPUTE_ASSERT(_num < 4);
    }

@subsubsection tests_overview_test_cases_register_fixture_data_test_case Registering a fixture as data test case

Allows a fixture to be used directly as a parameterized test case. Instead of
defining a new test function, the `run` method of the fixture is executed. The
setup method of the fixture needs to be a template and has to accept inputs
from the dataset as arguments. The dataset will be used to generate versions of
the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.
- Fourth argument is the dataset.


    class FixtureName : public framework::Fixture
    {
    public:
        template <typename T>
        void setup(T num)
        {
            _num = num;
        }

        void run() override
        {
            ARM_COMPUTE_ASSERT(_num < 4);
        }

    protected:
        int _num;
    };

    REGISTER_FIXTURE_DATA_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}));

@section writing_tests Writing validation tests

Before starting a new test case have a look at the existing ones. They should
provide a good overview of how test cases are structured.

- The C++ reference needs to be added to `tests/validation/CPP/`. The
  reference function is typically a template parameterized by the underlying
  value type of the `SimpleTensor`. This makes it easy to specialise for
  different data types. (A minimal sketch is shown after this list.)
- If all backends have a common interface it makes sense to share the setup
  code. This can be done by adding a fixture in
  `tests/validation/fixtures/`. Inside the `setup` method of a fixture
  the tensors can be created and initialised and the function can be configured
  and run. The actual test will only have to validate the results. To be shared
  among multiple backends the fixture class is usually a template that accepts
  the specific types (data, tensor class, function class etc.) as parameters.
- The actual test cases need to be added for each backend individually.
  Typically there will be multiple tests for different data types and for
  different execution modes, e.g. precommit and nightly.

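As a rough sketch of the first point, a reference for a hypothetical element-wise operation could look as follows (the function name and the operation are made up for illustration; real references live in `tests/validation/CPP/` and follow the same pattern):

    // Hypothetical reference: element-wise sum of two tensors, templated on the
    // underlying value type so the same code serves all data types.
    template <typename T>
    SimpleTensor<T> pixel_wise_sum(const SimpleTensor<T> &src1, const SimpleTensor<T> &src2)
    {
        // The output has the same shape and data type as the inputs.
        SimpleTensor<T> dst(src1.shape(), src1.data_type());

        // Plain element-wise loop over the flattened tensor.
        for(int i = 0; i < src1.num_elements(); ++i)
        {
            dst[i] = src1[i] + src2[i];
        }

        return dst;
    }

A backend fixture then only has to run the target function on real tensors and compare its output against the `SimpleTensor` returned by such a reference.
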
@section tests_running_tests Running tests
@subsection tests_running_tests_benchmark_and_validation Benchmarking and validation suites
@subsubsection tests_running_tests_benchmarking_filter Filter tests
All tests can be run by invoking

    ./arm_compute_benchmark ./data

where `./data` contains the assets needed by the tests.

If only a subset of the tests has to be executed, the `--filter` option takes a
regular expression to select matching tests.

    ./arm_compute_benchmark --filter='^NEON/.*AlexNet' ./data

@note Filtering will be much faster if the regular expression is anchored to the start ("^") or end ("$") of the line.

Additionally, each test has a test id which can be used as a filter, too.
However, the test id is not guaranteed to be stable when new tests are added;
a test only keeps the same id within a specific build.

    ./arm_compute_benchmark --filter-id=10 ./data

All available tests can be displayed with the `--list-tests` switch.

    ./arm_compute_benchmark --list-tests

More options can be found in the `--help` message.

@subsubsection tests_running_tests_benchmarking_runtime Runtime
By default every test is run once on a single thread. The number of iterations
can be controlled via the `--iterations` option and the number of threads via
`--threads`.

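For example, to run each selected benchmark for 10 iterations using 4 threads (an illustrative invocation combining the options above):

    ./arm_compute_benchmark --iterations=10 --threads=4 ./data
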
@subsubsection tests_running_tests_benchmarking_output Output
By default the benchmarking results are printed in a human-readable format on
the command line. The colored output can be disabled via `--no-color-output`.
As an alternative output format, JSON is supported and can be selected via
`--log-format=json`. To write the output to a file instead of stdout the
`--log-file` option can be used.

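For example, to write the results as JSON to a file instead of the console (an illustrative invocation; the file name is arbitrary):

    ./arm_compute_benchmark --log-format=json --log-file=benchmark_results.json ./data
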
@subsubsection tests_running_tests_benchmarking_mode Mode
Tests contain different datasets of different sizes, some of which will take several hours to run.
You can select which datasets to use with the `--mode` option; we recommend you use `--mode=precommit` to start with.

@subsubsection tests_running_tests_benchmarking_instruments Instruments
You can use the `--instruments` option to select one or more instruments to measure the execution time of the benchmark tests.

`PMU` will try to read the CPU PMU events from the kernel (they need to be enabled on your platform).

`MALI` will try to collect Mali hardware performance counters (you need a recent enough Mali driver).

`WALL_CLOCK_TIMER` will measure time using `gettimeofday`: this should work on all platforms.

You can pass a combination of these instruments: `--instruments=PMU,MALI,WALL_CLOCK_TIMER`

@note You need to make sure the instruments have been selected at compile time using the `pmu=1` or `mali=1` scons options.

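For instance, a benchmark build with both counter back-ends compiled in could be configured like this (a sketch; combine with your usual build options):

    scons benchmark_tests=1 pmu=1 mali=1
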
@subsubsection tests_running_examples Examples

To run all the precommit validation tests:

    LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit

To run the OpenCL precommit validation tests:

    LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit --filter="^CL.*"

To run the NEON precommit benchmark tests with the PMU and wall clock timer (in milliseconds) instruments enabled:

    LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^NEON.*" --instruments="pmu,wall_clock_timer_ms" --iterations=10

To run the OpenCL precommit benchmark tests with OpenCL kernel timers in milliseconds enabled:

    LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^CL.*" --instruments="opencl_timer_ms" --iterations=10

@note You might also need to add the path to the OpenCL library to your LD_LIBRARY_PATH if Compute Library was built with OpenCL enabled.
*/
} // namespace test
} // namespace arm_compute