NLP Approach For Discrimination Prevention In Data Mining
Data mining is an increasingly important technology for extracting useful knowledge hidden in large
collections of data. There are, however, negative social perceptions of data mining, mainly concerning potential
privacy invasion and potential discrimination. Discrimination consists of treating people unfairly on the basis of their
belonging to a specific group. Nowadays, automated data collection and data mining techniques are used to make automated
decisions such as loan granting or denial, insurance premium calculation, etc. If the training data sets are biased with
regard to discriminatory attributes such as gender, age, religion, or nationality, discriminatory decisions may result. For this
reason, anti-discrimination techniques such as discrimination discovery and discrimination prevention have been introduced
in data mining. Discrimination can be of two types: direct or indirect. Direct discrimination occurs when decisions are
based on sensitive attributes. Indirect discrimination, on the other hand, occurs when decisions are based on non-sensitive
attributes that are correlated with sensitive attributes. In this paper we tackle discrimination prevention in data mining
and propose new techniques applicable to direct or indirect discrimination prevention, individually or both at the same time.
To this end, we use a Natural Language Processing (NLP) approach.
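
The distinction between direct and indirect discrimination can be illustrated with a small sketch. It is not the technique proposed in this paper; the toy records, the attribute names (gender, zip_code, loan_granted), and the use of a simple grant-rate gap as a measure are all assumptions made for illustration only.

```python
# Hypothetical toy records: (gender, zip_code, loan_granted).
# Gender is treated as the sensitive attribute, zip code as a
# non-sensitive attribute that may be correlated with it.
RECORDS = [
    ("F", "10451", False),
    ("F", "10451", False),
    ("F", "10451", True),
    ("F", "10025", True),
    ("M", "10025", True),
    ("M", "10025", True),
    ("M", "10025", False),
    ("M", "10451", True),
]


def grant_rate(records, index, value):
    """Share of records whose attribute at `index` equals `value`
    and whose loan was granted."""
    matching = [r for r in records if r[index] == value]
    return sum(1 for r in matching if r[2]) / len(matching)


def group_share(records, index, value, given_index, given_value):
    """Share of records with attribute `index` == `value` among those
    whose attribute `given_index` == `given_value`."""
    subset = [r for r in records if r[given_index] == given_value]
    return sum(1 for r in subset if r[index] == value) / len(subset)


# Direct view: compare grant rates across the sensitive attribute.
gender_gap = grant_rate(RECORDS, 0, "M") - grant_rate(RECORDS, 0, "F")

# Indirect view: compare grant rates across the non-sensitive attribute,
# then check how strongly that attribute tracks the sensitive one.
zip_gap = grant_rate(RECORDS, 1, "10025") - grant_rate(RECORDS, 1, "10451")
female_share_10451 = group_share(RECORDS, 0, "F", 1, "10451")
female_share_10025 = group_share(RECORDS, 0, "F", 1, "10025")

print("grant-rate gap by gender:", gender_gap)
print("grant-rate gap by zip code:", zip_gap)
print("share of women in 10451 vs 10025:", female_share_10451, female_share_10025)
```

In this toy data, a rule that never looks at gender but disfavors one zip code would still affect women disproportionately, because zip code and gender are associated. This is the situation that indirect discrimination prevention is meant to address.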