• We propose a wavenumber selection framework based on a wavenumber importance index.
• The method increases classification performance using a small wavenumber range.
• The proposed method is applied to Cialis and Viagra FTIR-ATR data.


Attenuated total reflectance (ATR), a sampling technique used in Fourier transform infrared (FTIR) spectroscopy, has been adopted as an analytical tool for detecting fraudulent medicines. The spectrum generated by FTIR-ATR typically comprises hundreds of equally spaced wavenumbers, which may reduce the performance of techniques that classify samples as authentic or fraudulent. This paper proposes a novel method for selecting subsets of wavenumbers (variables) that better classify samples into these classes. To that end, principal component analysis (PCA) is integrated with the k-nearest neighbor (KNN) classification technique. PCA is applied to the FTIR-ATR data, and a variable importance index is built from the PCA outputs. An iterative backward variable elimination guided by that index is then carried out; after each variable removal, samples are classified as authentic or fraudulent using KNN, and the classification accuracy is measured. The wavenumber subset that best balances high accuracy with a small percentage of retained variables is chosen. When applied to Cialis FTIR-ATR data, the proposed approach retained on average only 1.84% of the original variables and increased the average classification accuracy by 2.1%, from 0.9689 to 0.9897; for the Viagra data, the method increased the average classification accuracy by 1.56%, from 0.9135 to 0.9278, using only 7.72% of the original variables.
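The selection loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the formula for the importance index, so the sketch assumes one (absolute PCA loadings weighted by explained variance). It also assumes that the least important variable is removed first, that accuracy is estimated by leave-one-out, and that the final subset is the one with the highest accuracy, with ties broken by fewer variables. All function names are hypothetical, labels are assumed to be coded 0/1, and k is assumed odd.

```python
import numpy as np

def pca_importance(X, n_components=2):
    """Assumed index: |loadings| of the leading PCs weighted by explained variance."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal axes (loadings).
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var_ratio = s**2 / np.sum(s**2)
    k = min(n_components, Vt.shape[0])
    return np.abs(Vt[:k]).T @ var_ratio[:k]

def knn_loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy of a Euclidean KNN majority vote (labels 0/1, odd k)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a sample may not vote for itself
    nn = np.argsort(d, axis=1)[:, :k]    # indices of the k nearest neighbors
    pred = (y[nn].mean(axis=1) > 0.5).astype(int)
    return float(np.mean(pred == y))

def backward_elimination(X, y, k=3, n_components=2):
    """Remove variables one at a time, least important first (assumed order)."""
    importance = pca_importance(X, n_components)
    order = np.argsort(importance)       # ascending importance
    retained = list(range(X.shape[1]))
    history = [(list(retained), knn_loo_accuracy(X, y, k))]
    for var in order[:-1]:               # always keep at least one variable
        retained.remove(int(var))
        history.append((list(retained), knn_loo_accuracy(X[:, retained], y, k)))
    return history

def select_subset(history):
    """Assumed rule: highest accuracy, ties broken by fewer retained variables."""
    return max(history, key=lambda h: (h[1], -len(h[0])))
```

Calling `backward_elimination(X, y)` on a samples-by-wavenumbers matrix `X` with a 0/1 label vector `y` returns one `(retained_indices, accuracy)` pair per elimination step, and `select_subset` picks the compromise subset from that history.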

Anzanello, Michel J., et al. "A multivariate-based wavenumber selection method for classifying medicines into authentic or counterfeit classes." Journal of Pharmaceutical & Biomedical Analysis 83 (September 2013): 209-214.