namespace arm_compute
{
namespace test
{
/**
@page tests Validation and benchmark tests

@tableofcontents

@section tests_overview Overview

Benchmark and validation tests are based on the same framework to set up and
run the tests. In addition to running simple, self-contained test functions the
framework supports fixtures and data test cases. The former make it possible to
share common setup routines between various backends, thus reducing the amount
of duplicated code. The latter can be used to parameterize tests or fixtures
with different inputs, e.g. different tensor shapes. One limitation is that
tests/fixtures cannot be parameterized based on the data type if static type
information is needed within the test (e.g. to validate the results).

@note By default tests are not built. To enable them you need to add `validation_tests=1` and/or `benchmark_tests=1` to your SCons line.

@note Tests are not included in the pre-built binary archive; you have to build them from source.

@subsection tests_overview_structure Directory structure

    .
    |-- computer_vision <- Legacy tests. No new tests must be added. <!-- FIXME: Remove before release -->
    `-- tests <- Top level test directory. All files in here are shared among validation and benchmark.
        |-- framework <- Underlying test framework.
        |-- CL \
        |-- NEON -> Backend specific files with helper functions etc.
        |-- VX / <!-- FIXME: Remove VX -->
        |-- benchmark <- Top level directory for the benchmarking files.
        |   |-- fixtures <- Fixtures for benchmark tests.
        |   |-- CL <- OpenCL backend test cases on a function level.
        |   |   `-- SYSTEM <- OpenCL system tests, e.g. whole networks
        |   `-- NEON <- Same for NEON
        |       `-- SYSTEM
        |-- datasets <- Datasets for benchmark and validation tests.
        |-- main.cpp <- Main entry point for the tests. Currently shared between validation and benchmarking.
        |-- networks <- Network classes for system level tests.
        `-- validation -> Top level directory for validation files.
            |-- CPP -> C++ reference code
            |-- CL \
            |-- NEON -> Backend specific test cases
            |-- VX / <!-- FIXME: Remove VX -->
            `-- fixtures -> Fixtures shared among all backends. Used to set up target function and tensors.

@subsection tests_overview_fixtures Fixtures

Fixtures can be used to share common setup, teardown or even run tasks among
multiple test cases. For that purpose a fixture can define a `setup`,
`teardown` and `run` method. Additionally, the constructor and destructor
might also be customized.

An instance of the fixture is created immediately before the actual test is
executed. After construction the @ref framework::Fixture::setup method is
called. Then the test function or the fixture's `run` method is invoked. After
test execution the @ref framework::Fixture::teardown method is called and
lastly the fixture is destroyed.

@subsubsection tests_overview_fixtures_fixture Fixture

Fixtures for non-parameterized tests are straightforward. The custom fixture
class has to inherit from @ref framework::Fixture and can implement any of the
`setup`, `teardown` or `run` methods. None of the methods takes any arguments
or returns anything.

    class CustomFixture : public framework::Fixture
    {
        void setup()
        {
            _ptr = malloc(4000);
        }

        void run()
        {
            ARM_COMPUTE_ASSERT(_ptr != nullptr);
        }

        void teardown()
        {
            free(_ptr);
        }

        void *_ptr;
    };

@subsubsection tests_overview_fixtures_data_fixture Data fixture

The advantage of a parameterized fixture is that arguments can be passed to the
setup method at runtime. To make this possible the setup method has to be a
template with a type parameter for every argument (though the template
parameter doesn't have to be used). All other methods remain the same.

    class CustomFixture : public framework::Fixture
    {
    #ifdef ALTERNATIVE_DECLARATION
        template <typename ...>
        void setup(size_t size)
        {
            _ptr = malloc(size);
        }
    #else
        template <typename T>
        void setup(T size)
        {
            _ptr = malloc(size);
        }
    #endif

        void run()
        {
            ARM_COMPUTE_ASSERT(_ptr != nullptr);
        }

        void teardown()
        {
            free(_ptr);
        }

        void *_ptr;
    };

@subsection tests_overview_test_cases Test cases

All of the following test case macros can optionally be prefixed with
`EXPECTED_FAILURE_` or `DISABLED_`.
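
For instance, a test that is registered but skipped during normal runs can be
declared by prefixing the `TEST_CASE` macro described below (an illustrative
sketch, `SkippedTestName` is a made-up name):

    DISABLED_TEST_CASE(SkippedTestName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(1 + 1, 2);
    }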

@subsubsection tests_overview_test_cases_test_case Test case

A simple test case function taking no inputs and having no (shared) state.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the dataset mode in which the test will be active.


    TEST_CASE(TestCaseName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(1 + 1, 2);
    }

@subsubsection tests_overview_test_cases_fixture_fixture_test_case Fixture test case

A simple test case function, taking no inputs, that inherits from a fixture.
The test case will have access to all public and protected members of the
fixture. Only the setup and teardown methods of the fixture will be used. The
body of this function will be used as the test function.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.


    class FixtureName : public framework::Fixture
    {
    public:
        void setup() override
        {
            _one = 1;
        }

    protected:
        int _one;
    };

    FIXTURE_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(_one + 1, 2);
    }

@subsubsection tests_overview_test_cases_fixture_register_fixture_test_case Registering a fixture as test case

Allows using a fixture directly as a test case. Instead of defining a new test
function, the `run` method of the fixture will be executed.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.


    class FixtureName : public framework::Fixture
    {
    public:
        void setup() override
        {
            _one = 1;
        }

        void run() override
        {
            ARM_COMPUTE_ASSERT_EQUAL(_one + 1, 2);
        }

    protected:
        int _one;
    };

    REGISTER_FIXTURE_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT);

@subsubsection tests_overview_test_cases_data_test_case Data test case

A parameterized test case function that has no (shared) state. The dataset will
be used to generate versions of the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the dataset mode in which the test will be active.
- Third argument is the dataset.
- Further arguments specify the names of the arguments to the test function.
  Their number must match the arity of the dataset.


    DATA_TEST_CASE(TestCaseName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}), num)
    {
        ARM_COMPUTE_ASSERT(num < 4);
    }

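Datasets of higher arity take one test function argument per component. A
minimal sketch of a two-argument variant, assuming a `combine` helper (modelled
on the framework's dataset helpers) that forms the cartesian product of two
one-dimensional datasets:

    // 'combine' is assumed to yield every (num, letter) pair (hypothetical here).
    DATA_TEST_CASE(CombinedTestCaseName, DatasetMode::PRECOMMIT,
                   combine(framework::make("Numbers", {1, 2, 3}), framework::make("Letters", {'a', 'b'})),
                   num, letter)
    {
        ARM_COMPUTE_ASSERT(num < 4);
    }
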
@subsubsection tests_overview_test_cases_fixture_data_test_case Fixture data test case

A parameterized test case that inherits from a fixture. The test case will have
access to all public and protected members of the fixture. Only the setup and
teardown methods of the fixture will be used. The setup method of the fixture
needs to be a template and has to accept inputs from the dataset as arguments.
The body of this function will be used as the test function. The dataset will
be used to generate versions of the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.
- Fourth argument is the dataset.


    class FixtureName : public framework::Fixture
    {
    public:
        template <typename T>
        void setup(T num)
        {
            _num = num;
        }

    protected:
        int _num;
    };

    FIXTURE_DATA_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}))
    {
        ARM_COMPUTE_ASSERT(_num < 4);
    }

@subsubsection tests_overview_test_cases_register_fixture_data_test_case Registering a fixture as data test case

Allows using a fixture directly as a parameterized test case. Instead of
defining a new test function, the `run` method of the fixture will be executed.
The setup method of the fixture needs to be a template and has to accept inputs
from the dataset as arguments. The dataset will be used to generate versions of
the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.
- Fourth argument is the dataset.


    class FixtureName : public framework::Fixture
    {
    public:
        template <typename T>
        void setup(T num)
        {
            _num = num;
        }

        void run() override
        {
            ARM_COMPUTE_ASSERT(_num < 4);
        }

    protected:
        int _num;
    };

    REGISTER_FIXTURE_DATA_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}));

@section writing_tests Writing validation tests

Before starting a new test case have a look at the existing ones. They should
provide a good overview of how test cases are structured.

- The C++ reference needs to be added to `tests/validation/CPP/`. The
  reference function is typically a template parameterized by the underlying
  value type of the `SimpleTensor`. This makes it easy to specialise for
  different data types (see the sketch after this list).
- If all backends have a common interface it makes sense to share the setup
  code. This can be done by adding a fixture in
  `tests/validation/fixtures/`. Inside of the `setup` method of a fixture
  the tensors can be created and initialised and the function can be configured
  and run. The actual test will only have to validate the results. To be shared
  among multiple backends the fixture class is usually a template that accepts
  the specific types (data, tensor class, function class etc.) as parameters.
- The actual test cases need to be added for each backend individually.
  Typically there will be multiple tests for different data types and for
  different execution modes, e.g. precommit and nightly.

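As a rough sketch of the first two steps, a reference function and a shared
fixture could look as follows. All names (`absolute_value`,
`AbsoluteValueValidationFixture`) are made up for illustration, and the
`SimpleTensor` accessors are assumed to follow the pattern of the existing
reference implementations:

    // Hypothetical reference in tests/validation/CPP/, templated on the
    // underlying value type of the SimpleTensor.
    template <typename T>
    SimpleTensor<T> absolute_value(const SimpleTensor<T> &src)
    {
        SimpleTensor<T> dst(src.shape(), src.data_type());

        for(int i = 0; i < src.num_elements(); ++i)
        {
            dst[i] = std::abs(src[i]);
        }

        return dst;
    }

    // Hypothetical shared fixture in tests/validation/fixtures/. The backend
    // specific types are template parameters so that the NEON and CL test
    // cases can reuse the same setup code.
    template <typename TensorType, typename AccessorType, typename FunctionType, typename T>
    class AbsoluteValueValidationFixture : public framework::Fixture
    {
    public:
        template <typename...>
        void setup(TensorShape shape, DataType data_type)
        {
            // Create and allocate the target tensors, configure and run the
            // function, then compute the expected output with the reference
            // function above and keep both for the test body to validate.
        }
    };

The backend specific test cases would then instantiate such a fixture via
`FIXTURE_DATA_TEST_CASE` as shown earlier.
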
<!-- FIXME: Remove before release -->
@section building_test_dependencies Building dependencies

@note Only required when tests from the old validation framework need to be run.

The tests currently make use of Boost (Test and Program options) for
validation. Below are instructions on how to build these 3rd party libraries.

@note By default the build of the validation and benchmark tests is disabled; to enable it use `validation_tests=1` and `benchmark_tests=1`.

@subsection building_boost Building Boost

First follow the instructions from the Boost library on how to set up the
Boost build system
(http://www.boost.org/doc/libs/1_64_0/more/getting_started/index.html).
Afterwards the required libraries can be built with:

    ./b2 --with-program_options --with-test link=static \
         define=BOOST_TEST_ALTERNATIVE_INIT_API

Additionally, depending on your environment, it might be necessary to specify
the ```toolset=``` option to choose the right compiler. Moreover,
```address-model=32``` can be used to force building for 32-bit and
```target-os=android``` must be specified to build for Android.

After executing the build command the libraries
```libboost_program_options.a``` and ```libboost_unit_test_framework.a``` can
be found in ```./stage/lib```.
<!-- FIXME: end remove -->

@section tests_running_tests Running tests
@subsection tests_running_tests_benchmarking Benchmarking
@subsubsection tests_running_tests_benchmarking_filter Filter tests
All tests can be run by invoking

    ./arm_compute_benchmark ./data

where `./data` contains the assets needed by the tests.

If only a subset of the tests has to be executed the `--filter` option takes a
regular expression to select matching tests.

    ./arm_compute_benchmark --filter='^NEON/.*AlexNet' ./data

@note Filtering will be much faster if the regular expression is anchored to the start ("^") or end ("$") of the line.

Additionally, each test has a test id which can also be used as a filter.
However, the test id is not guaranteed to be stable when new tests are added;
a test only keeps its id within one specific build.

    ./arm_compute_benchmark --filter-id=10 ./data

All available tests can be displayed with the `--list-tests` switch.

    ./arm_compute_benchmark --list-tests

More options can be found in the `--help` message.

@subsubsection tests_running_tests_benchmarking_runtime Runtime
By default every test is run once on a single thread. The number of iterations
can be controlled via the `--iterations` option and the number of threads via
`--threads`.
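
For example, to run each benchmark for 10 iterations on 4 threads (the values
are illustrative):

    ./arm_compute_benchmark --iterations=10 --threads=4 ./data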

@subsubsection tests_running_tests_benchmarking_output Output
By default the benchmarking results are printed in a human readable format on
the command line. The colored output can be disabled via `--no-color-output`.
As an alternative output format JSON is supported and can be selected via
`--log-format=json`. To write the output to a file instead of stdout the
`--log-file` option can be used.
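
For instance, to capture the results as JSON in a file (the file name is
illustrative):

    ./arm_compute_benchmark --log-format=json --log-file=benchmark.json ./data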

@subsubsection tests_running_tests_benchmarking_mode Mode
Tests contain different datasets of different sizes, some of which will take several hours to run.
You can select which datasets to use with the `--mode` option; we recommend you use `--mode=precommit` to start with.

@subsubsection tests_running_tests_benchmarking_instruments Instruments
You can use the `--instruments` option to select one or more instruments to measure the execution time of the benchmark tests.

`PMU` will try to read the CPU PMU events from the kernel (they need to be enabled on your platform).

`MALI` will try to collect Mali hardware performance counters (you need to have a recent enough Mali driver).

`WALL_CLOCK_TIMER` will measure time using `gettimeofday`: this should work on all platforms.

You can pass a combination of these instruments: `--instruments=PMU,MALI,WALL_CLOCK_TIMER`

@note You need to make sure the instruments have been selected at compile time using the `pmu=1` or `mali=1` scons options.

@subsubsection tests_running_examples Examples

To run all the precommit validation tests:

    LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit

To run the OpenCL precommit validation tests:

    LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit --filter="^CL.*"

To run the NEON precommit benchmark tests with the PMU and wall clock timer (in milliseconds) instruments enabled:

    LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^NEON.*" --instruments="pmu,wall_clock_timer_ms" --iterations=10

To run the OpenCL precommit benchmark tests with OpenCL kernel timers (in milliseconds) enabled:

    LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^CL.*" --instruments="opencl_timer_ms" --iterations=10

<!-- FIXME: Remove before release and change above to benchmark and validation -->
@subsection tests_running_tests_validation Validation

@note The new validation tests have the same interface as the benchmarking tests.

@subsubsection tests_running_tests_validation_filter Filter tests
All tests can be run by invoking

    ./arm_compute_validation -- ./data

where `./data` contains the assets needed by the tests.

As running all tests can take a lot of time the suite is split into "precommit"
and "nightly" tests. The precommit tests are fast to execute but still cover
the most important features. In contrast, the nightly tests offer more
extensive coverage but take longer. The different subsets can be selected from
the command line as follows:

    ./arm_compute_validation -t @precommit -- ./data
    ./arm_compute_validation -t @nightly -- ./data

Additionally it is possible to select specific suites or tests:

    ./arm_compute_validation -t CL -- ./data
    ./arm_compute_validation -t NEON/BitwiseAnd/RunSmall/_0 -- ./data

All available tests can be displayed with the `--list_content` switch.

    ./arm_compute_validation --list_content -- ./data

For a complete list of possible selectors please see: http://www.boost.org/doc/libs/1_64_0/libs/test/doc/html/boost_test/runtime_config/test_unit_filtering.html

@subsubsection tests_running_tests_validation_verbosity Verbosity
There are two separate flags to control the verbosity of the test output.
`--report_level` controls the verbosity of the summary produced after all tests
have been executed. `--log_level` controls the verbosity of the information
generated during the execution of tests. All available settings can be found in
the Boost documentation for
[--report_level](http://www.boost.org/doc/libs/1_64_0/libs/test/doc/html/boost_test/utf_reference/rt_param_reference/report_level.html)
and
[--log_level](http://www.boost.org/doc/libs/1_64_0/libs/test/doc/html/boost_test/utf_reference/rt_param_reference/log_level.html),
respectively.
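
For example, to get a detailed summary and the most verbose log (both values
are documented in the Boost.Test pages linked above):

    ./arm_compute_validation --report_level=detailed --log_level=all -- ./data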
<!-- FIXME: end remove -->
*/
} // namespace test
} // namespace arm_compute