Merge pull request #12 from SC-SGS/fix_codacy_issues

vancraar · web-flow · commit f17ad6205919 · 2022-03-07T09:20:48.000+01:00
Fix codacy issues
diff --git a/LICENSE.md b/LICENSE.md
@@ -1,4 +1,4 @@
-MIT License
+# MIT License
 
 Copyright (c) 2021 Alexander Van Craen and Marcel Breyer @ University of Stuttgart
 
diff --git a/README.md b/README.md
@@ -3,7 +3,7 @@
 [![Codacy Badge](https://app.codacy.com/project/badge/Grade/e780a63075ce40c29c49d3df4f57c2af)](https://www.codacy.com/gh/SC-SGS/PLSSVM/dashboard?utm_source=github.com&amp;utm_medium=referral&amp;utm_content=SC-SGS/PLSSVM&amp;utm_campaign=Badge_Grade) &ensp; [![Generate documentation](https://github.com/SC-SGS/PLSSVM/actions/workflows/documentation.yml/badge.svg)](https://sc-sgs.github.io/PLSSVM/) &ensp; [![Build Status Linux CPU + GPU](https://simsgs.informatik.uni-stuttgart.de/jenkins/buildStatus/icon?job=PLSSVM%2FMultibranch-Github%2Fmain&subject=Linux+CPU/GPU)](https://simsgs.informatik.uni-stuttgart.de/jenkins/view/PLSSVM/job/PLSSVM/job/Multibranch-Github/job/main/) &ensp; [![Windows CPU](https://github.com/SC-SGS/PLSSVM/actions/workflows/msvc_windows.yml/badge.svg)](https://github.com/SC-SGS/PLSSVM/actions/workflows/msvc_windows.yml)
 
 A [Support Vector Machine (SVM)](https://en.wikipedia.org/wiki/Support-vector_machine) is a supervised machine learning model.
-In its basic form SVMs are used for binary classification tasks. 
+In its basic form SVMs are used for binary classification tasks.
 Their fundamental idea is to learn a hyperplane which separates the two classes best, i.e., where the widest possible margin around its decision boundary is free of data.
 This is also the reason, why SVMs are also called "large margin classifiers".
 To predict to which class a new, unseen data point belongs, the SVM simply has to calculate on which side of the previously calculated hyperplane the data point lies.
@@ -28,40 +28,46 @@ We decided to use the [Conjugate Gradient (CG)](https://en.wikipedia.org/wiki/Co
 Since one of our main goals was performance, we parallelized the implicit matrix-vector multiplication inside the CG algorithm.
 To do so, we use multiple different frameworks to be able to target a broad variety of different hardware platforms.
 The currently available frameworks (also called backends in our PLSSVM implementation) are:
-  - [OpenMP](https://www.openmp.org/)
-  - [CUDA](https://developer.nvidia.com/cuda-zone)
-  - [OpenCL](https://www.khronos.org/opencl/)
-  - [SYCL](https://www.khronos.org/sycl/) (tested implementations are [DPC++](https://github.com/intel/llvm) and [hipSYCL](https://github.com/illuhad/hipSYCL))
+
+- [OpenMP](https://www.openmp.org/)
+- [CUDA](https://developer.nvidia.com/cuda-zone)
+- [OpenCL](https://www.khronos.org/opencl/)
+- [SYCL](https://www.khronos.org/sycl/) (tested implementations are [DPC++](https://github.com/intel/llvm) and [hipSYCL](https://github.com/illuhad/hipSYCL))
 
 ## Getting Started
 
 ### Dependencies
 
 General dependencies:
-  - a C++17 capable compiler (e.g. [`gcc`](https://gcc.gnu.org/) or [`clang`](https://clang.llvm.org/))
-  - [CMake](https://cmake.org/) 3.18 or newer
-  - [cxxopts](https://github.com/jarro2783/cxxopts), [fast_float](https://github.com/fastfloat/fast_float) and [{fmt}](https://github.com/fmtlib/fmt) (all three are automatically build during the CMake configuration if they couldn't be found using the respective `find_package` call)
-  - [GoogleTest](https://github.com/google/googletest) if testing is enabled (automatically build during the CMake configuration if `find_package(GTest)` wasn't successful)
-  - [doxygen](https://www.doxygen.nl/index.html) if documentation generation is enabled
-  - [OpenMP](https://www.openmp.org/) 4.0 or newer (optional) to speed-up file parsing
-  - multiple Python modules used in the utility scripts; <br>to install all modules use `pip install --user -r install/python_requirements.txt`
+
+- a C++17 capable compiler (e.g. [`gcc`](https://gcc.gnu.org/) or [`clang`](https://clang.llvm.org/))
+- [CMake](https://cmake.org/) 3.18 or newer
+- [cxxopts](https://github.com/jarro2783/cxxopts), [fast_float](https://github.com/fastfloat/fast_float) and [{fmt}](https://github.com/fmtlib/fmt) (all three are automatically build during the CMake configuration if they couldn't be found using the respective `find_package` call)
+- [GoogleTest](https://github.com/google/googletest) if testing is enabled (automatically build during the CMake configuration if `find_package(GTest)` wasn't successful)
+- [doxygen](https://www.doxygen.nl/index.html) if documentation generation is enabled
+- [OpenMP](https://www.openmp.org/) 4.0 or newer (optional) to speed-up file parsing
+- multiple Python modules used in the utility scripts, to install all modules use `pip install --user -r install/python_requirements.txt`
 
 Additional dependencies for the OpenMP backend:
-  - compiler with OpenMP support
+
+- compiler with OpenMP support
 
 Additional dependencies for the CUDA backend:
-  - CUDA SDK
-  - either NVIDIA [`nvcc`](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html) or [`clang` with CUDA support enabled](https://llvm.org/docs/CompileCudaWithLLVM.html)
+
+- CUDA SDK
+- either NVIDIA [`nvcc`](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html) or [`clang` with CUDA support enabled](https://llvm.org/docs/CompileCudaWithLLVM.html)
 
 Additional dependencies for the OpenCL backend:
-  - OpenCL runtime and header files
+
+- OpenCL runtime and header files
 
 Additional dependencies for the SYCL backend:
-  - the code must be compiled with a SYCL capable compiler; currently tested with [DPC++](https://github.com/intel/llvm) and [hipSYCL](https://github.com/illuhad/hipSYCL)
+
+- the code must be compiled with a SYCL capable compiler; currently tested with [DPC++](https://github.com/intel/llvm) and [hipSYCL](https://github.com/illuhad/hipSYCL)
 
 Additional dependencies if `PLSSVM_ENABLE_TESTING` and `PLSSVM_GENERATE_TEST_FILE` are both set to `ON`:
-  - [Python3](https://www.python.org/) with the [`argparse`](https://docs.python.org/3/library/argparse.html), [`timeit`](https://docs.python.org/3/library/timeit.html) and [`sklearn`](https://scikit-learn.org/stable/) modules
 
+- [Python3](https://www.python.org/) with the [`argparse`](https://docs.python.org/3/library/argparse.html), [`timeit`](https://docs.python.org/3/library/timeit.html) and [`sklearn`](https://scikit-learn.org/stable/) modules
 
 ### Building
 
@@ -79,17 +85,18 @@ cmake --build .
 
 The **required** CMake option `PLSSVM_TARGET_PLATFORMS` is used to determine for which targets the backends should be compiled.
 Valid targets are:
-  - `cpu`: compile for the CPU; an **optional** architectural specifications is allowed but only used when compiling with DPC++, e.g., `cpu:avx2`
-  - `nvidia`: compile for NVIDIA GPUs; **at least one** architectural specification is necessary, e.g., `nvidia:sm_86,sm_70`
-  - `amd`: compile for AMD GPUs; **at least one** architectural specification is necessary, e.g., `amd:gfx906`
-  - `intel`: compile for Intel GPUs; **at least one** architectural specification is necessary, e.g., `intel:skl`
+
+- `cpu`: compile for the CPU; an **optional** architectural specifications is allowed but only used when compiling with DPC++, e.g., `cpu:avx2`
+- `nvidia`: compile for NVIDIA GPUs; **at least one** architectural specification is necessary, e.g., `nvidia:sm_86,sm_70`
+- `amd`: compile for AMD GPUs; **at least one** architectural specification is necessary, e.g., `amd:gfx906`
+- `intel`: compile for Intel GPUs; **at least one** architectural specification is necessary, e.g., `intel:skl`
 
 At least one of the above targets must be present.
 
 Note that when using DPC++ only a single architectural specification for `cpu` or `amd` is allowed.
 
 To retrieve the architectural specifications of the current system, a simple Python3 script `utility/plssvm_target_platforms.py` is provided
-(required Python3 dependencies: 
+(required Python3 dependencies:
 [`argparse`](https://docs.python.org/3/library/argparse.html), [`py-cpuinfo`](https://pypi.org/project/py-cpuinfo/),
 [`GPUtil`](https://pypi.org/project/GPUtil/), [`pyamdgpuinfo`](https://pypi.org/project/pyamdgpuinfo/), and
 [`pylspci`](https://pypi.org/project/pylspci/))
@@ -120,46 +127,50 @@ cpu:avx512;intel:dg1
 ```
 
 If the architectural information for the requested GPU could not be retrieved, one option would be to have a look at:
-  - for NVIDIA GPUs:  [Your GPU Compute Capability](https://developer.nvidia.com/cuda-gpus)
-  - for AMD GPUs: [clang AMDGPU backend usage](https://llvm.org/docs/AMDGPUUsage.html)
-  - for Intel GPUs and CPUs: [Ahead of Time Compilation](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-dpcpp-cpp-compiler-dev-guide-and-reference/top/compilation/ahead-of-time-compilation.html) and [Intel graphics processor table](https://dgpu-docs.intel.com/devices/hardware-table.html)
 
+- for NVIDIA GPUs:  [Your GPU Compute Capability](https://developer.nvidia.com/cuda-gpus)
+- for AMD GPUs: [clang AMDGPU backend usage](https://llvm.org/docs/AMDGPUUsage.html)
+- for Intel GPUs and CPUs: [Ahead of Time Compilation](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-dpcpp-cpp-compiler-dev-guide-and-reference/top/compilation/ahead-of-time-compilation.html) and [Intel graphics processor table](https://dgpu-docs.intel.com/devices/hardware-table.html)
 
 #### Optional CMake Options
 
 The `[optional_options]` can be one or multiple of:
 
-  - `PLSSVM_ENABLE_OPENMP_BACKEND=ON|OFF|AUTO` (default: `AUTO`):
-    - `ON`: check for the OpenMP backend and fail if not available
-    - `AUTO`: check for the OpenMP backend but **do not** fail if not available
-    - `OFF`: do not check for the OpenMP backend
-  - `PLSSVM_ENABLE_CUDA_BACKEND=ON|OFF|AUTO` (default: `AUTO`):
-    - `ON`: check for the CUDA backend and fail if not available
-    - `AUTO`: check for the CUDA backend but **do not** fail if not available
-    - `OFF`: do not check for the CUDA backend
-  - `PLSSVM_ENABLE_OPENCL_BACKEND=ON|OFF|AUTO` (default: `AUTO`):
-    - `ON`: check for the OpenCL backend and fail if not available
-    - `AUTO`: check for the OpenCL backend but **do not** fail if not available
-    - `OFF`: do not check for the OpenCL backend
-  - `PLSSVM_ENABLE_SYCL_BACKEND=ON|OFF|AUTO` (default: `AUTO`):
-    - `ON`: check for the SYCL backend and fail if not available
-    - `AUTO`: check for the SYCL backend but **do not** fail if not available
-    - `OFF`: do not check for the SYCL backend
+- `PLSSVM_ENABLE_OPENMP_BACKEND=ON|OFF|AUTO` (default: `AUTO`):
+  - `ON`: check for the OpenMP backend and fail if not available
+  - `AUTO`: check for the OpenMP backend but **do not** fail if not available
+  - `OFF`: do not check for the OpenMP backend
+
+- `PLSSVM_ENABLE_CUDA_BACKEND=ON|OFF|AUTO` (default: `AUTO`):
+  - `ON`: check for the CUDA backend and fail if not available
+  - `AUTO`: check for the CUDA backend but **do not** fail if not available
+  - `OFF`: do not check for the CUDA backend
+
+- `PLSSVM_ENABLE_OPENCL_BACKEND=ON|OFF|AUTO` (default: `AUTO`):
+  - `ON`: check for the OpenCL backend and fail if not available
+  - `AUTO`: check for the OpenCL backend but **do not** fail if not available
+  - `OFF`: do not check for the OpenCL backend
+
+- `PLSSVM_ENABLE_SYCL_BACKEND=ON|OFF|AUTO` (default: `AUTO`):
+  - `ON`: check for the SYCL backend and fail if not available
+  - `AUTO`: check for the SYCL backend but **do not** fail if not available
+  - `OFF`: do not check for the SYCL backend
 
 **Attention:** at least one backend must be enabled and available!
 
-  - `PLSSVM_ENABLE_ASSERTS=ON|OFF` (default: `OFF`): enables custom assertions regardless whether the `DEBUG` macro is defined or not
-  - `PLSSVM_THREAD_BLOCK_SIZE` (default: `16`): set a specific thread block size used in the GPU kernels (for fine-tuning optimizations)
-  - `PLSSVM_INTERNAL_BLOCK_SIZE` (default: `6`: set a specific internal block size used in the GPU kernels (for fine-tuning optimizations)
-  - `PLSSVM_EXECUTABLES_USE_SINGLE_PRECISION` (default: `OFF`): enables single precision calculations instead of double precision for the `svm-train` and `svm-predict` executables
-  - `PLSSVM_ENABLE_LTO=ON|OFF` (default: `ON`): enable interprocedural optimization (IPO/LTO) if supported by the compiler
-  - `PLSSVM_ENABLE_DOCUMENTATION=ON|OFF` (default: `OFF`): enable the `doc` target using doxygen
-  - `PLSSVM_ENABLE_TESTING=ON|OFF` (default: `ON`): enable testing using GoogleTest and ctest
-  - `PLSSVM_GENERATE_TIMING_SCRIPT=ON|OFF` (default: `OFF`): configure a timing script usable for performance measurement
+- `PLSSVM_ENABLE_ASSERTS=ON|OFF` (default: `OFF`): enables custom assertions regardless whether the `DEBUG` macro is defined or not
+- `PLSSVM_THREAD_BLOCK_SIZE` (default: `16`): set a specific thread block size used in the GPU kernels (for fine-tuning optimizations)
+- `PLSSVM_INTERNAL_BLOCK_SIZE` (default: `6`: set a specific internal block size used in the GPU kernels (for fine-tuning optimizations)
+- `PLSSVM_EXECUTABLES_USE_SINGLE_PRECISION` (default: `OFF`): enables single precision calculations instead of double precision for the `svm-train` and `svm-predict` executables
+- `PLSSVM_ENABLE_LTO=ON|OFF` (default: `ON`): enable interprocedural optimization (IPO/LTO) if supported by the compiler
+- `PLSSVM_ENABLE_DOCUMENTATION=ON|OFF` (default: `OFF`): enable the `doc` target using doxygen
+- `PLSSVM_ENABLE_TESTING=ON|OFF` (default: `ON`): enable testing using GoogleTest and ctest
+- `PLSSVM_GENERATE_TIMING_SCRIPT=ON|OFF` (default: `OFF`): configure a timing script usable for performance measurement
 
 If `PLSSVM_ENABLE_TESTING` is set to `ON`, the following options can also be set:
-  - `PLSSVM_GENERATE_TEST_FILE=ON|OFF` (default: `ON`): automatically generate test files
-    - `PLSSVM_TEST_FILE_NUM_DATA_POINTS` (default: `5000`): the number of data points in the test file
+
+- `PLSSVM_GENERATE_TEST_FILE=ON|OFF` (default: `ON`): automatically generate test files
+  - `PLSSVM_TEST_FILE_NUM_DATA_POINTS` (default: `5000`): the number of data points in the test file
 
 If the SYCL backend is available and DPC++ is used, the option `PLSSVM_SYCL_DPCPP_USE_LEVEL_ZERO` can be used to select Level-Zero as the
 DPC++ backend instead of OpenCL.
@@ -190,9 +201,11 @@ The resulting `html` coverage report is located in the `coverage` folder in the
 ### Creating the documentation
 
 If doxygen is installed and `PLSSVM_ENABLE_DOCUMENTATION` is set to `ON` the documentation can be build using
+
 ```bash
 make doc
 ```
+
 The documentation of the current state of the main branch can be found [here](https://sc-sgs.github.io/PLSSVM/).
 
 ## Installing
@@ -211,8 +224,8 @@ The repository comes with a Python3 script (in the `utility_scripts/` directory)
 
 In order to use all functionality, the following Python3 modules must be installed:
 [`argparse`](https://docs.python.org/3/library/argparse.html), [`timeit`](https://docs.python.org/3/library/timeit.html),
-[`numpy`](https://pypi.org/project/numpy/), [`pandas`](https://pypi.org/project/pandas/), 
-[`sklearn`](https://scikit-learn.org/stable/), [`arff`](https://pypi.org/project/arff/), 
+[`numpy`](https://pypi.org/project/numpy/), [`pandas`](https://pypi.org/project/pandas/),
+[`sklearn`](https://scikit-learn.org/stable/), [`arff`](https://pypi.org/project/arff/),
 [`matplotlib`](https://pypi.org/project/matplotlib/) and
 [`mpl_toolkits`](https://pypi.org/project/matplotlib/)
 
@@ -374,7 +387,6 @@ target_compile_features(prog PUBLIC cxx_std_17)
 target_link_libraries(prog PUBLIC plssvm::svm-all)
 ```
 
-
 ## License
 
 The PLSSVM library is distributed under the MIT [license](https://github.com/SC-SGS/PLSSVM/blob/main/LICENSE.md).
diff --git a/src/plssvm/parameter.cpp b/src/plssvm/parameter.cpp
@@ -425,7 +425,7 @@ void parameter<T>::parse_model_file(const std::string &filename) {
             } else if (detail::starts_with(line, "total_sv")) {
                 // the total number of support vectors must be greater than 0
                 num_sv = detail::convert_to<decltype(num_sv)>(value);
-                if (num_sv <= 0) {
+                if (num_sv == 0) {
                     throw invalid_file_format_exception{ fmt::format("The number of support vectors must be greater than 0, but is {}!", num_sv) };
                 }
             } else if (detail::starts_with(line, "rho")) {
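Note on the hunk above: the switch from `<=` to `==` keeps the behaviour unchanged only if `num_sv` has an unsigned type. The diff does not show its declaration, so this is an assumption. For an unsigned value the `<` half of `<= 0` can never be true, which is the kind of comparison static analyzers tend to flag. A minimal sketch of the resulting check, using a hypothetical helper `require_positive_sv_count` and `std::invalid_argument` in place of PLSSVM's `invalid_file_format_exception`:

```cpp
#include <cassert>    // assert (for trying out the helper)
#include <cstddef>    // std::size_t
#include <stdexcept>  // std::invalid_argument
#include <string>     // std::to_string

// Hypothetical stand-in for the check in parse_model_file.
// Assumption: the support vector count is stored as std::size_t.
inline std::size_t require_positive_sv_count(const std::size_t num_sv) {
    // for an unsigned type, `num_sv <= 0` can only be true when num_sv == 0,
    // so comparing against 0 directly states the intent without a dead branch
    if (num_sv == 0) {
        throw std::invalid_argument("The number of support vectors must be greater than 0, but is " + std::to_string(num_sv) + "!");
    }
    return num_sv;
}
```

Calling `require_positive_sv_count(3)` returns `3`, while `require_positive_sv_count(0)` throws.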
diff --git a/tests/backends/generic_tests.hpp b/tests/backends/generic_tests.hpp
@@ -25,12 +25,13 @@
 #include "fmt/format.h"   // fmt::format
 #include "fmt/ostream.h"  // can use fmt using operator<< overloads
 #include "gmock/gmock.h"  // EXPECT_THAT
-#include "gtest/gtest.h"  // GTEST_USES_POSIX_RE, ASSERT_EQ, EXPECT_EQ, EXPECT_GT, testing::ContainsRegex, testing::StaticAssertTypeEq
+#include "gtest/gtest.h"  // ASSERT_GT, ASSERT_TRUE, ASSERT_EQ, EXPECT_EQ, EXPECT_GT, testing::ContainsRegex, testing::StaticAssertTypeEq
 
 #include <algorithm>   // std::generate
 #include <filesystem>  // std::filesystem::remove
 #include <fstream>     // std::ifstream
 #include <random>      // std::random_device, std::mt19937, std::uniform_real_distribution
+#include <regex>       // std::regex, std::regex_match
 #include <string>      // std::string, std::getline
 #include <vector>      // std::vector
 
@@ -69,26 +70,48 @@ inline void write_model_test() {
     // write learned model to file
     csvm.write_model(model_file);
 
-    // read content of model file and delete it
-    std::ifstream model_ifs(model_file);
-    std::string file_content((std::istreambuf_iterator<char>(model_ifs)), std::istreambuf_iterator<char>());
-    model_ifs.close();
+    // read content of model file line by line and delete it
+    std::vector<std::string> lines;
+    {
+        std::ifstream model_ifs(model_file);
+        std::string line;
+        while (std::getline(model_ifs, line)) {
+            lines.push_back(std::move(line));
+        }
+    }
     std::filesystem::remove(model_file);
 
-    // check model file content for correctness
-#ifdef GTEST_USES_POSIX_RE
+    // create vector containing correct regex
+    std::vector<std::string> regex_patterns;
+    regex_patterns.emplace_back("svm_type c_svc");
+    regex_patterns.emplace_back(fmt::format("kernel_type {}", params.kernel));
     switch (params.kernel) {
         case plssvm::kernel_type::linear:
-            EXPECT_THAT(file_content, testing::ContainsRegex("^svm_type c_svc\nkernel_type linear\nnr_class 2\ntotal_sv [0-9]+\nrho [-+]?[0-9]*.?[0-9]+([eE][-+]?[0-9]+)?\nlabel 1 -1\nnr_sv [0-9]+ [0-9]+\nSV"));
             break;
         case plssvm::kernel_type::polynomial:
-            EXPECT_THAT(file_content, testing::ContainsRegex("^svm_type c_svc\nkernel_type polynomial\ndegree [0-9]+\ngamma [-+]?[0-9]*.?[0-9]+([eE][-+]?[0-9]+)?\ncoef0 [-+]?[0-9]*.?[0-9]+([eE][-+]?[0-9]+)?\nnr_class 2\ntotal_sv [0-9]+\nrho [-+]?[0-9]*.?[0-9]+([eE][-+]?[0-9]+)?\nlabel 1 -1\nnr_sv [0-9]+ [0-9]+\nSV"));
+            regex_patterns.emplace_back("degree [0-9]+");
+            regex_patterns.emplace_back("gamma [-+]?[0-9]*.?[0-9]+([eE][-+]?[0-9]+)?");
+            regex_patterns.emplace_back("coef0 [-+]?[0-9]*.?[0-9]+([eE][-+]?[0-9]+)?");
             break;
         case plssvm::kernel_type::rbf:
-            EXPECT_THAT(file_content, testing::ContainsRegex("^svm_type c_svc\nkernel_type rbf\ngamma [-+]?[0-9]*.?[0-9]+([eE][-+]?[0-9]+)?\nnr_class 2\ntotal_sv [0-9]+\nrho [-+]?[0-9]*.?[0-9]+([eE][-+]?[0-9]+)?\nlabel 1 -1\nnr_sv [0-9]+ [0-9]+\nSV"));
+            regex_patterns.emplace_back("gamma [-+]?[0-9]*.?[0-9]+([eE][-+]?[0-9]+)?");
             break;
     }
-#endif
+    regex_patterns.emplace_back("nr_class 2");
+    regex_patterns.emplace_back("total_sv [0-9]+");
+    regex_patterns.emplace_back("rho [-+]?[0-9]*.?[0-9]+([eE][-+]?[0-9]+)?");
+    regex_patterns.emplace_back("label 1 -1");
+    regex_patterns.emplace_back("nr_sv [0-9]+ [0-9]+");
+    regex_patterns.emplace_back("SV");
+
+    // at least number of header entries lines must be present
+    ASSERT_GT(lines.size(), regex_patterns.size());
+
+    // check if the model header is valid
+    for (std::vector<std::string>::size_type i = 0; i < regex_patterns.size(); ++i) {
+        std::regex reg(regex_patterns[i], std::regex::extended);
+        ASSERT_TRUE(std::regex_match(lines[i], reg)) << "line: " << i << " doesn't match regex pattern: " << regex_patterns[i];
+    }
 }
 
 template <template <typename> typename csvm_type, typename real_type, plssvm::kernel_type kernel>
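Note on the `write_model_test` hunk above: the new test no longer depends on `GTEST_USES_POSIX_RE`. It matches each header line against its own pattern with `std::regex_match`. The same idea can be sketched outside GoogleTest as follows. The helper name `first_mismatch` is hypothetical and not part of PLSSVM. Unlike the patterns in the diff, the example in the usage note escapes the decimal point (`\\.`), so it only accepts a literal dot.

```cpp
#include <cassert>  // assert (for trying out the helper)
#include <cstddef>  // std::size_t
#include <regex>    // std::regex, std::regex_match
#include <string>   // std::string
#include <vector>   // std::vector

// Hypothetical helper: returns the index of the first header line that does not
// fully match its pattern, or patterns.size() if every pattern is matched.
inline std::size_t first_mismatch(const std::vector<std::string> &lines, const std::vector<std::string> &patterns) {
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        // a missing line counts as a mismatch at that position
        if (i >= lines.size()) {
            return i;
        }
        // POSIX extended grammar, as in the test above; regex_match requires the whole line to match
        const std::regex reg(patterns[i], std::regex::extended);
        if (!std::regex_match(lines[i], reg)) {
            return i;
        }
    }
    return patterns.size();
}
```

With the patterns `"svm_type c_svc"`, `"total_sv [0-9]+"` and `"rho [-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?"`, the lines `svm_type c_svc`, `total_sv 42`, `rho -1.5e-3` all match, whereas a second line of `total_sv none` yields index `1`. Returning the index mirrors the diagnostic message in the test, which reports the offending line number.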
