namespace arm_compute
{
namespace test
{
/**
@page tests Validation and benchmark tests

@tableofcontents

@section tests_overview Overview

Benchmark and validation tests are based on the same framework to set up and
run the tests. In addition to running simple, self-contained test functions the
framework supports fixtures and data test cases. The former make it possible to
share common setup routines between various backends, thus reducing the amount
of duplicated code. The latter can be used to parameterize tests or fixtures
with different inputs, e.g. different tensor shapes. One limitation is that
tests/fixtures cannot be parameterized based on the data type if static type
information is needed within the test (e.g. to validate the results).

@note By default tests are not built. To enable them you need to add `validation_tests=1` and/or `benchmark_tests=1` to your SCons line.

@note Tests are not included in the pre-built binary archive; you have to build them from source.

@subsection tests_overview_structure Directory structure

    .
    |-- computer_vision <- Legacy tests. No new tests must be added. <!-- FIXME: Remove before release -->
    `-- tests <- Top level test directory. All files in here are shared among validation and benchmark.
        |-- framework <- Underlying test framework.
        |-- CL \
        |-- NEON -> Backend specific files with helper functions etc.
        |-- VX / <!-- FIXME: Remove VX -->
        |-- benchmark <- Top level directory for the benchmarking files.
        |   |-- fixtures <- Fixtures for benchmark tests.
        |   |-- CL <- OpenCL backend test cases on a function level.
        |   |   `-- SYSTEM <- OpenCL system tests, e.g. whole networks
        |   `-- NEON <- Same for NEON
        |       `-- SYSTEM
        |-- datasets <- Datasets for benchmark and validation tests.
        |-- main.cpp <- Main entry point for the tests. Currently shared between validation and benchmarking.
        |-- networks <- Network classes for system level tests.
        `-- validation -> Top level directory for validation files.
            |-- CPP -> C++ reference code
            |-- CL \
            |-- NEON -> Backend specific test cases
            |-- VX / <!-- FIXME: Remove VX -->
            `-- fixtures -> Fixtures shared among all backends. Used to set up target function and tensors.

@subsection tests_overview_fixtures Fixtures

Fixtures can be used to share common setup, teardown or even run tasks among
multiple test cases. For that purpose a fixture can define a `setup`,
`teardown` and `run` method. Additionally, the constructor and destructor
might also be customized.

An instance of the fixture is created immediately before the actual test is
executed. After construction the @ref framework::Fixture::setup method is
called. Then the test function or the fixture's `run` method is invoked. After
test execution the @ref framework::Fixture::teardown method is called and
lastly the fixture is destroyed.

@subsubsection tests_overview_fixtures_fixture Fixture

Fixtures for non-parameterized tests are straightforward. The custom fixture
class has to inherit from @ref framework::Fixture and can implement any of the
`setup`, `teardown` or `run` methods. None of the methods takes any arguments
or returns anything.

    class CustomFixture : public framework::Fixture
    {
        void setup()
        {
            _ptr = malloc(4000);
        }

        void run()
        {
            ARM_COMPUTE_ASSERT(_ptr != nullptr);
        }

        void teardown()
        {
            free(_ptr);
        }

        void *_ptr;
    };

@subsubsection tests_overview_fixtures_data_fixture Data fixture

The advantage of a parameterized fixture is that arguments can be passed to the
setup method at runtime. To make this possible the setup method has to be a
template with a type parameter for every argument (though the template
parameter doesn't have to be used). All other methods remain the same.

    class CustomFixture : public framework::Fixture
    {
    #ifdef ALTERNATIVE_DECLARATION
        template <typename ...>
        void setup(size_t size)
        {
            _ptr = malloc(size);
        }
    #else
        template <typename T>
        void setup(T size)
        {
            _ptr = malloc(size);
        }
    #endif

        void run()
        {
            ARM_COMPUTE_ASSERT(_ptr != nullptr);
        }

        void teardown()
        {
            free(_ptr);
        }

        void *_ptr;
    };

@subsection tests_overview_test_cases Test cases

All of the following test case macros can optionally be prefixed with
`EXPECTED_FAILURE_` or `DISABLED_`.
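
For instance, a test that is registered but skipped during normal runs can be
declared by prefixing the `TEST_CASE` macro described below (an illustrative
sketch, `SkippedTestName` is a made-up name):

    DISABLED_TEST_CASE(SkippedTestName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(1 + 1, 2);
    }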

@subsubsection tests_overview_test_cases_test_case Test case

A simple test case function taking no inputs and having no (shared) state.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the dataset mode in which the test will be active.


    TEST_CASE(TestCaseName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(1 + 1, 2);
    }

@subsubsection tests_overview_test_cases_fixture_fixture_test_case Fixture test case

A simple test case function, taking no inputs, that inherits from a fixture.
The test case will have access to all public and protected members of the
fixture. Only the setup and teardown methods of the fixture will be used. The
body of this function will be used as the test function.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.


    class FixtureName : public framework::Fixture
    {
    public:
        void setup() override
        {
            _one = 1;
        }

    protected:
        int _one;
    };

    FIXTURE_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(_one + 1, 2);
    }

@subsubsection tests_overview_test_cases_fixture_register_fixture_test_case Registering a fixture as test case

Allows using a fixture directly as a test case. Instead of defining a new test
function, the `run` method of the fixture will be executed.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.


    class FixtureName : public framework::Fixture
    {
    public:
        void setup() override
        {
            _one = 1;
        }

        void run() override
        {
            ARM_COMPUTE_ASSERT_EQUAL(_one + 1, 2);
        }

    protected:
        int _one;
    };

    REGISTER_FIXTURE_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT);

@subsubsection tests_overview_test_cases_data_test_case Data test case

A parameterized test case function that has no (shared) state. The dataset will
be used to generate versions of the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the dataset mode in which the test will be active.
- Third argument is the dataset.
- Further arguments specify the names of the arguments to the test function.
  Their number must match the arity of the dataset.


    DATA_TEST_CASE(TestCaseName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}), num)
    {
        ARM_COMPUTE_ASSERT(num < 4);
    }

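Datasets of higher arity take one test function argument per component. A
minimal sketch of a two-argument variant, assuming a `combine` helper (modelled
on the framework's dataset helpers) that forms the cartesian product of two
one-dimensional datasets:

    // 'combine' is assumed to yield every (num, letter) pair (hypothetical here).
    DATA_TEST_CASE(CombinedTestCaseName, DatasetMode::PRECOMMIT,
                   combine(framework::make("Numbers", {1, 2, 3}), framework::make("Letters", {'a', 'b'})),
                   num, letter)
    {
        ARM_COMPUTE_ASSERT(num < 4);
    }
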
@subsubsection tests_overview_test_cases_fixture_data_test_case Fixture data test case

A parameterized test case that inherits from a fixture. The test case will have
access to all public and protected members of the fixture. Only the setup and
teardown methods of the fixture will be used. The setup method of the fixture
needs to be a template and has to accept inputs from the dataset as arguments.
The body of this function will be used as the test function. The dataset will
be used to generate versions of the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.
- Fourth argument is the dataset.


    class FixtureName : public framework::Fixture
    {
    public:
        template <typename T>
        void setup(T num)
        {
            _num = num;
        }

    protected:
        int _num;
    };

    FIXTURE_DATA_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}))
    {
        ARM_COMPUTE_ASSERT(_num < 4);
    }

@subsubsection tests_overview_test_cases_register_fixture_data_test_case Registering a fixture as data test case

Allows using a fixture directly as a parameterized test case. Instead of
defining a new test function, the `run` method of the fixture will be executed.
The setup method of the fixture needs to be a template and has to accept inputs
from the dataset as arguments. The dataset will be used to generate versions of
the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.
- Fourth argument is the dataset.


    class FixtureName : public framework::Fixture
    {
    public:
        template <typename T>
        void setup(T num)
        {
            _num = num;
        }

        void run() override
        {
            ARM_COMPUTE_ASSERT(_num < 4);
        }

    protected:
        int _num;
    };

    REGISTER_FIXTURE_DATA_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}));

@section writing_tests Writing validation tests

Before starting a new test case have a look at the existing ones. They should
provide a good overview of how test cases are structured.

- The C++ reference needs to be added to `tests/validation/CPP/`. The
  reference function is typically a template parameterized by the underlying
  value type of the `SimpleTensor`. This makes it easy to specialise for
  different data types (see the sketch after this list).
- If all backends have a common interface it makes sense to share the setup
  code. This can be done by adding a fixture in
  `tests/validation/fixtures/`. Inside of the `setup` method of a fixture
  the tensors can be created and initialised and the function can be configured
  and run. The actual test will only have to validate the results. To be shared
  among multiple backends the fixture class is usually a template that accepts
  the specific types (data, tensor class, function class etc.) as parameters.
- The actual test cases need to be added for each backend individually.
  Typically there will be multiple tests for different data types and for
  different execution modes, e.g. precommit and nightly.

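As a rough sketch of the first two steps, a reference function and a shared
fixture could look as follows. All names (`absolute_value`,
`AbsoluteValueValidationFixture`) are made up for illustration, and the
`SimpleTensor` accessors are assumed to follow the pattern of the existing
reference implementations:

    // Hypothetical reference in tests/validation/CPP/, templated on the
    // underlying value type of the SimpleTensor.
    template <typename T>
    SimpleTensor<T> absolute_value(const SimpleTensor<T> &src)
    {
        SimpleTensor<T> dst(src.shape(), src.data_type());

        for(int i = 0; i < src.num_elements(); ++i)
        {
            dst[i] = std::abs(src[i]);
        }

        return dst;
    }

    // Hypothetical shared fixture in tests/validation/fixtures/. The backend
    // specific types are template parameters so that the NEON and CL test
    // cases can reuse the same setup code.
    template <typename TensorType, typename AccessorType, typename FunctionType, typename T>
    class AbsoluteValueValidationFixture : public framework::Fixture
    {
    public:
        template <typename...>
        void setup(TensorShape shape, DataType data_type)
        {
            // Create and allocate the target tensors, configure and run the
            // function, then compute the expected output with the reference
            // function above and keep both for the test body to validate.
        }
    };

The backend specific test cases would then instantiate such a fixture via
`FIXTURE_DATA_TEST_CASE` as shown earlier.
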
<!-- FIXME: Remove before release -->
@section building_test_dependencies Building dependencies

@note Only required when tests from the old validation framework need to be run.

The tests currently make use of Boost (Test and Program options) for
validation. Below are instructions on how to build these 3rd party libraries.

@note By default the build of the validation and benchmark tests is disabled; to enable it use `validation_tests=1` and `benchmark_tests=1`.

@subsection building_boost Building Boost

First follow the instructions from the Boost library on how to set up the
Boost build system
(http://www.boost.org/doc/libs/1_64_0/more/getting_started/index.html).
Afterwards the required libraries can be built with:

    ./b2 --with-program_options --with-test link=static \
         define=BOOST_TEST_ALTERNATIVE_INIT_API

Additionally, depending on your environment, it might be necessary to specify
the ```toolset=``` option to choose the right compiler. Moreover,
```address-model=32``` can be used to force building for 32-bit and
```target-os=android``` must be specified to build for Android.

After executing the build command the libraries
```libboost_program_options.a``` and ```libboost_unit_test_framework.a``` can
be found in ```./stage/lib```.
<!-- FIXME: end remove -->

@section tests_running_tests Running tests
@subsection tests_running_tests_benchmarking Benchmarking
@subsubsection tests_running_tests_benchmarking_filter Filter tests
All tests can be run by invoking

    ./arm_compute_benchmark ./data

where `./data` contains the assets needed by the tests.

If only a subset of the tests has to be executed the `--filter` option takes a
regular expression to select matching tests.

    ./arm_compute_benchmark --filter='^NEON/.*AlexNet' ./data

@note Filtering will be much faster if the regular expression is anchored to the start ("^") or end ("$") of the line.

Additionally, each test has a test id which can also be used as a filter.
However, the test id is not guaranteed to be stable when new tests are added;
a test only keeps its id within one specific build.

    ./arm_compute_benchmark --filter-id=10 ./data

All available tests can be displayed with the `--list-tests` switch.

    ./arm_compute_benchmark --list-tests

More options can be found in the `--help` message.

@subsubsection tests_running_tests_benchmarking_runtime Runtime
By default every test is run once on a single thread. The number of iterations
can be controlled via the `--iterations` option and the number of threads via
`--threads`.
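
For example, to run each benchmark for 10 iterations on 4 threads (the values
are illustrative):

    ./arm_compute_benchmark --iterations=10 --threads=4 ./data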

@subsubsection tests_running_tests_benchmarking_output Output
By default the benchmarking results are printed in a human readable format on
the command line. The colored output can be disabled via `--no-color-output`.
As an alternative output format JSON is supported and can be selected via
`--log-format=json`. To write the output to a file instead of stdout the
`--log-file` option can be used.
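
For instance, to capture the results as JSON in a file (the file name is
illustrative):

    ./arm_compute_benchmark --log-format=json --log-file=benchmark.json ./data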

@subsubsection tests_running_tests_benchmarking_mode Mode
Tests contain different datasets of different sizes, some of which will take several hours to run.
You can select which datasets to use with the `--mode` option; we recommend you use `--mode=precommit` to start with.

@subsubsection tests_running_tests_benchmarking_instruments Instruments
You can use the `--instruments` option to select one or more instruments to measure the execution time of the benchmark tests.

`PMU` will try to read the CPU PMU events from the kernel (they need to be enabled on your platform).

`MALI` will try to collect Mali hardware performance counters (you need to have a recent enough Mali driver).

`WALL_CLOCK_TIMER` will measure time using `gettimeofday`: this should work on all platforms.

You can pass a combination of these instruments: `--instruments=PMU,MALI,WALL_CLOCK_TIMER`

@note You need to make sure the instruments have been selected at compile time using the `pmu=1` or `mali=1` scons options.

@subsubsection tests_running_examples Examples

To run all the precommit validation tests:

    LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit

To run the OpenCL precommit validation tests:

    LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit --filter="^CL.*"

To run the NEON precommit benchmark tests with the PMU and wall clock timer (in milliseconds) instruments enabled:

    LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^NEON.*" --instruments="pmu,wall_clock_timer_ms" --iterations=10

To run the OpenCL precommit benchmark tests with OpenCL kernel timers (in milliseconds) enabled:

    LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^CL.*" --instruments="opencl_timer_ms" --iterations=10

<!-- FIXME: Remove before release and change above to benchmark and validation -->
@subsection tests_running_tests_validation Validation

@note The new validation tests have the same interface as the benchmarking tests.

@subsubsection tests_running_tests_validation_filter Filter tests
All tests can be run by invoking

    ./arm_compute_validation -- ./data

where `./data` contains the assets needed by the tests.

As running all tests can take a lot of time the suite is split into "precommit"
and "nightly" tests. The precommit tests are fast to execute but still cover
the most important features. In contrast, the nightly tests offer more
extensive coverage but take longer. The different subsets can be selected from
the command line as follows:

    ./arm_compute_validation -t @precommit -- ./data
    ./arm_compute_validation -t @nightly -- ./data

Additionally it is possible to select specific suites or tests:

    ./arm_compute_validation -t CL -- ./data
    ./arm_compute_validation -t NEON/BitwiseAnd/RunSmall/_0 -- ./data

All available tests can be displayed with the `--list_content` switch.

    ./arm_compute_validation --list_content -- ./data

For a complete list of possible selectors please see: http://www.boost.org/doc/libs/1_64_0/libs/test/doc/html/boost_test/runtime_config/test_unit_filtering.html

@subsubsection tests_running_tests_validation_verbosity Verbosity
There are two separate flags to control the verbosity of the test output.
`--report_level` controls the verbosity of the summary produced after all tests
have been executed. `--log_level` controls the verbosity of the information
generated during the execution of tests. All available settings can be found in
the Boost documentation for
[--report_level](http://www.boost.org/doc/libs/1_64_0/libs/test/doc/html/boost_test/utf_reference/rt_param_reference/report_level.html)
and
[--log_level](http://www.boost.org/doc/libs/1_64_0/libs/test/doc/html/boost_test/utf_reference/rt_param_reference/log_level.html),
respectively.
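
For example, to get a detailed summary and the most verbose log (both values
are documented in the Boost.Test pages linked above):

    ./arm_compute_validation --report_level=detailed --log_level=all -- ./data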
<!-- FIXME: end remove -->
*/
} // namespace test
} // namespace arm_compute