///
/// Copyright (c) 2017-2021 Arm Limited.
///
/// SPDX-License-Identifier: MIT
///
/// Permission is hereby granted, free of charge, to any person obtaining a copy
/// of this software and associated documentation files (the "Software"), to
/// deal in the Software without restriction, including without limitation the
/// rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
/// sell copies of the Software, and to permit persons to whom the Software is
/// furnished to do so, subject to the following conditions:
///
/// The above copyright notice and this permission notice shall be included in all
/// copies or substantial portions of the Software.
///
/// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
/// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
/// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
/// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
/// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
/// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
/// SOFTWARE.
///
namespace arm_compute
{
namespace test
{
/**
@page tests Validation and Benchmarks

@tableofcontents

@section tests_overview Overview

Benchmark and validation tests are based on the same framework to set up and
run the tests. In addition to running simple, self-contained test functions the
framework supports fixtures and data test cases. The former allow common setup
routines to be shared between various backends, thus reducing the amount of
duplicated code. The latter can be used to parameterize tests or fixtures with
different inputs, e.g. different tensor shapes. One limitation is that
tests/fixtures cannot be parameterized based on the data type if static type
information is needed within the test (e.g. to validate the results).

@note By default tests are not built. To enable them you need to add validation_tests=1 and / or benchmark_tests=1 to your SCons line.

@note Tests are not included in the pre-built binary archive; you have to build them from source.
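
As an illustration of such a SCons line (the `os` and `arch` values here are
examples only; adjust them to your target platform):

    # Build the library together with both test suites
    scons os=linux arch=arm64-v8a validation_tests=1 benchmark_tests=1 -j8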

@subsection tests_overview_fixtures Fixtures

Fixtures can be used to share common setup, teardown or even run tasks among
multiple test cases. For that purpose a fixture can define a `setup`,
`teardown` and `run` method. Additionally the constructor and destructor can
also be customised.

An instance of the fixture is created immediately before the actual test is
executed. After construction the @ref framework::Fixture::setup method is called. Then the test
function or the fixture's `run` method is invoked. After test execution the
@ref framework::Fixture::teardown method is called and lastly the fixture is destructed.

@subsubsection tests_overview_fixtures_fixture Fixture

Fixtures for non-parameterized tests are straightforward. The custom fixture
class has to inherit from @ref framework::Fixture and can implement any of the
`setup`, `teardown` or `run` methods. None of the methods takes any arguments
or returns anything.

    class CustomFixture : public framework::Fixture
    {
        void setup()
        {
            _ptr = malloc(4000);
        }

        void run()
        {
            ARM_COMPUTE_ASSERT(_ptr != nullptr);
        }

        void teardown()
        {
            free(_ptr);
        }

        void *_ptr;
    };

@subsubsection tests_overview_fixtures_data_fixture Data fixture

The advantage of a parameterized fixture is that arguments can be passed to the
setup method at runtime. To make this possible the setup method has to be a
template with a type parameter for every argument (though the template
parameter doesn't have to be used). All other methods remain the same.

    class CustomFixture : public framework::Fixture
    {
    #ifdef ALTERNATIVE_DECLARATION
        template <typename ...>
        void setup(size_t size)
        {
            _ptr = malloc(size);
        }
    #else
        template <typename T>
        void setup(T size)
        {
            _ptr = malloc(size);
        }
    #endif

        void run()
        {
            ARM_COMPUTE_ASSERT(_ptr != nullptr);
        }

        void teardown()
        {
            free(_ptr);
        }

        void *_ptr;
    };

@subsection tests_overview_test_cases Test cases

All of the following commands can optionally be prefixed with
`EXPECTED_FAILURE_` or `DISABLED_`.

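For example, prefixing a test case temporarily disables it without removing the
code (the test case name below is illustrative only):

    DISABLED_TEST_CASE(BrokenOnSomePlatforms, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(1 + 1, 2);
    }

Analogously, `EXPECTED_FAILURE_TEST_CASE` marks a test whose failure is
anticipated.
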
@subsubsection tests_overview_test_cases_test_case Test case

A simple test case function taking no inputs and having no (shared) state.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the dataset mode in which the test will be active.


    TEST_CASE(TestCaseName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(1 + 1, 2);
    }

@subsubsection tests_overview_test_cases_fixture_fixture_test_case Fixture test case

A simple test case function taking no inputs that inherits from a fixture. The
test case will have access to all public and protected members of the fixture.
Only the setup and teardown methods of the fixture will be used. The body of
this function will be used as the test function.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.


    class FixtureName : public framework::Fixture
    {
    public:
        void setup() override
        {
            _one = 1;
        }

    protected:
        int _one;
    };

    FIXTURE_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT)
    {
        ARM_COMPUTE_ASSERT_EQUAL(_one + 1, 2);
    }

@subsubsection tests_overview_test_cases_fixture_register_fixture_test_case Registering a fixture as test case

Allows a fixture to be used directly as a test case. Instead of defining a new
test function, the `run` method of the fixture is executed.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.


    class FixtureName : public framework::Fixture
    {
    public:
        void setup() override
        {
            _one = 1;
        }

        void run() override
        {
            ARM_COMPUTE_ASSERT_EQUAL(_one + 1, 2);
        }

    protected:
        int _one;
    };

    REGISTER_FIXTURE_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT);


@subsubsection tests_overview_test_cases_data_test_case Data test case

A parameterized test case function that has no (shared) state. The dataset will
be used to generate versions of the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the dataset mode in which the test will be active.
- Third argument is the dataset.
- Further arguments specify names of the arguments to the test function. The
  number must match the arity of the dataset.


    DATA_TEST_CASE(TestCaseName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}), num)
    {
        ARM_COMPUTE_ASSERT(num < 4);
    }

@subsubsection tests_overview_test_cases_fixture_data_test_case Fixture data test case

A parameterized test case that inherits from a fixture. The test case will have
access to all public and protected members of the fixture. Only the setup and
teardown methods of the fixture will be used. The setup method of the fixture
needs to be a template and has to accept inputs from the dataset as arguments.
The body of this function will be used as the test function. The dataset will
be used to generate versions of the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.
- Fourth argument is the dataset.


    class FixtureName : public framework::Fixture
    {
    public:
        template <typename T>
        void setup(T num)
        {
            _num = num;
        }

    protected:
        int _num;
    };

    FIXTURE_DATA_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}))
    {
        ARM_COMPUTE_ASSERT(_num < 4);
    }

@subsubsection tests_overview_test_cases_register_fixture_data_test_case Registering a fixture as data test case

Allows a fixture to be used directly as a parameterized test case. Instead of
defining a new test function, the `run` method of the fixture is executed.
The setup method of the fixture needs to be a template and has to accept inputs
from the dataset as arguments. The dataset will be used to generate versions of
the test case with different inputs.

- First argument is the name of the test case (has to be unique within the
  enclosing test suite).
- Second argument is the class name of the fixture.
- Third argument is the dataset mode in which the test will be active.
- Fourth argument is the dataset.


    class FixtureName : public framework::Fixture
    {
    public:
        template <typename T>
        void setup(T num)
        {
            _num = num;
        }

        void run() override
        {
            ARM_COMPUTE_ASSERT(_num < 4);
        }

    protected:
        int _num;
    };

    REGISTER_FIXTURE_DATA_TEST_CASE(TestCaseName, FixtureName, DatasetMode::PRECOMMIT, framework::make("Numbers", {1, 2, 3}));

@section writing_tests Writing validation tests

Before starting a new test case have a look at the existing ones. They should
provide a good overview of how test cases are structured.

- The C++ reference needs to be added to `tests/validation/CPP/`. The
  reference function is typically a template parameterized by the underlying
  value type of the `SimpleTensor`. This makes it easy to specialise for
  different data types.
- If all backends have a common interface it makes sense to share the setup
  code. This can be done by adding a fixture in
  `tests/validation/fixtures/`. Inside the `setup` method of a fixture
  the tensors can be created and initialised and the function can be configured
  and run. The actual test will only have to validate the results. To be shared
  among multiple backends the fixture class is usually a template that accepts
  the specific types (data, tensor class, function class etc.) as parameters.
- The actual test cases need to be added for each backend individually.
  Typically there will be multiple tests for different data types and for
  different execution modes, e.g. precommit and nightly.

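As a sketch of the reference-function pattern described above (the function
name `reference_negate` and the exact signature are illustrative, not an
actual reference in the library), a host-side reference for a simple
element-wise operation could look like:

    template <typename T>
    SimpleTensor<T> reference_negate(const SimpleTensor<T> &src)
    {
        SimpleTensor<T> dst{ src.shape(), src.data_type() };

        // Compute the expected result element by element on the host
        for(int i = 0; i < src.num_elements(); ++i)
        {
            dst[i] = -src[i];
        }

        return dst;
    }

Because the template is parameterized by the value type `T`, the same function
body can serve e.g. `float` and integer specialisations.
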
@section tests_running_tests Running tests
@subsection tests_running_tests_benchmark_and_validation Benchmarking and validation suites
@subsubsection tests_running_tests_benchmarking_filter Filter tests
All tests can be run by invoking

    ./arm_compute_benchmark ./data

where `./data` contains the assets needed by the tests.

If only a subset of the tests has to be executed, the `--filter` option takes a
regular expression to select matching tests.

    ./arm_compute_benchmark --filter='^NEON/.*AlexNet' ./data

@note Filtering will be much faster if the regular expression is anchored to the start ("^") or end ("$") of the line.

Additionally each test has a test id which can be used as a filter, too.
However, the test id is not guaranteed to be stable when new tests are added;
it only stays the same within a specific build.

    ./arm_compute_benchmark --filter-id=10 ./data

All available tests can be displayed with the `--list-tests` switch.

    ./arm_compute_benchmark --list-tests

More options can be found in the `--help` message.

@subsubsection tests_running_tests_benchmarking_runtime Runtime
By default every test is run once on a single thread. The number of iterations
can be controlled via the `--iterations` option and the number of threads via
`--threads`.

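For example, a possible invocation averaging over ten iterations on four
threads (the iteration and thread counts are arbitrary):

    ./arm_compute_benchmark --iterations=10 --threads=4 ./data
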
@subsubsection tests_running_tests_benchmarking_output Output
By default the benchmarking results are printed in a human-readable format on
the command line. The colored output can be disabled via `--no-color-output`.
As an alternative output format JSON is supported and can be selected via
`--log-format=json`. To write the output to a file instead of stdout the
`--log-file` option can be used.

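Combining the two, the following would write JSON results to a file (the file
name is an example):

    ./arm_compute_benchmark --log-format=json --log-file=benchmark_results.json ./data
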
@subsubsection tests_running_tests_benchmarking_mode Mode
Tests contain different datasets of different sizes, some of which will take several hours to run.
You can select which datasets to use with the `--mode` option; we recommend you use `--mode=precommit` to start with.

@subsubsection tests_running_tests_benchmarking_instruments Instruments
You can use the `--instruments` option to select one or more instruments to measure the execution time of the benchmark tests.

`PMU` will try to read the CPU PMU events from the kernel (they need to be enabled on your platform).

`MALI` will try to collect Arm® Mali™ hardware performance counters (you need to have a recent enough Arm® Mali™ driver).

`WALL_CLOCK_TIMER` will measure time using `gettimeofday`: this should work on all platforms.

You can pass a combination of these instruments: `--instruments=PMU,MALI,WALL_CLOCK_TIMER`

@note You need to make sure the instruments have been selected at compile time using the `pmu=1` or `mali=1` scons options.

@subsubsection tests_running_examples Examples

To run all the precommit validation tests:

    LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit

To run the OpenCL precommit validation tests:

    LD_LIBRARY_PATH=. ./arm_compute_validation --mode=precommit --filter="^CL.*"

To run the Arm® Neon™ precommit benchmark tests with the PMU and wall clock timer (in milliseconds) instruments enabled:

    LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^NEON.*" --instruments="pmu,wall_clock_timer_ms" --iterations=10

To run the OpenCL precommit benchmark tests with OpenCL kernel timers in milliseconds enabled:

    LD_LIBRARY_PATH=. ./arm_compute_benchmark --mode=precommit --filter="^CL.*" --instruments="opencl_timer_ms" --iterations=10

@note You might need to export the path to the OpenCL library as well in your LD_LIBRARY_PATH if Compute Library was built with OpenCL enabled.
*/
} // namespace test
} // namespace arm_compute