Rough Set Theory (RST)

Project Description

This project implements a machine learning algorithm based on Zdzislaw Pawlak's Rough Set Theory to predict whether golf will be played (Play = Yes/No) given the weather conditions.

Project structure

The project consists of the following files:

Train_data_golf_14ex.csv: Training dataset.
Test_data_golf_50ex.csv: Test dataset.
RS_ML.py: The main script with the implementation of the algorithm.

Installation

Clone the repository:

git clone https://github.com/your-username/rst-golf-prediction.git

Go to your project folder:

cd rst-golf-prediction

Install required dependencies:

pip install pandas

Using the algorithm

Place your CSV data files in the project root folder.
For correct operation, set the paths to the training and test datasets in the script according to their location on your computer:

df_path = 'Put your personal path here'

df_test_path = 'Put your personal path here too'

Run the script RS_ML.py:

python RS_ML.py

Example of work

Outlook	Humidity %	Wind	Play
Overcast	87	False	Yes
Sunny	80	True	Yes
Sunny	80	True	Yes
Overcast	75	True	Yes
Overcast	75	True	Yes
Rainy	80	False	No
Sunny	80	True	No
Rainy	80	False	No
Rainy	85	False	No
Overcast	87	False	Yes

After launch, the script prints the following intermediate results, which show how the production rules are constructed:

Getting an elementary subsets of dataset:
[[0, 9], [1, 2, 6], [3, 4], [5, 7], [8]]
[[0, 9], [3, 4]]

======== Production rules for positive region ========
1) IF (Outlook = Overcast)& (Humidity% = 87 & 75)& (Wind = False & True)& THEN DECISION "PLAY" = PLAY

======== Production rules for negative region ========
2) IF (Outlook = Rainy)&(Humidity% = 85 V 80)&(Wind = False) THEN DECISION "PLAY" = DON'T PLAY

======== Production rules for boundry region ========
3) IF (Outlook = Sunny)&(Humidity% = 80)&(Wind = True) THEN DECISION "PLAY" = MAYBE PLAY

Approximation accuracy: 0.571
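The intermediate results above can be reproduced by hand. The following is a minimal sketch, not the script's own code: it uses plain tuples instead of pandas, and the function names are hypothetical. It assumes the first list in the output is the set of elementary subsets, the second is the lower approximation of the objects with Play = Yes, and the approximation accuracy is the size of the lower approximation divided by the size of the upper approximation.

```python
# Training rows copied from the example table: (Outlook, Humidity, Wind, Play).
ROWS = [
    ("Overcast", 87, False, "Yes"),
    ("Sunny", 80, True, "Yes"),
    ("Sunny", 80, True, "Yes"),
    ("Overcast", 75, True, "Yes"),
    ("Overcast", 75, True, "Yes"),
    ("Rainy", 80, False, "No"),
    ("Sunny", 80, True, "No"),
    ("Rainy", 80, False, "No"),
    ("Rainy", 85, False, "No"),
    ("Overcast", 87, False, "Yes"),
]


def elementary_subsets(rows):
    """Group row indexes whose condition attributes (all but the last) match."""
    groups = {}
    for idx, row in enumerate(rows):
        groups.setdefault(row[:-1], []).append(idx)
    return list(groups.values())


def lower_approximation(elementary, target):
    """Elementary subsets that lie entirely inside the target set."""
    return [s for s in elementary if set(s) <= target]


def upper_approximation(elementary, target):
    """Elementary subsets that share at least one object with the target set."""
    return [s for s in elementary if set(s) & target]


elementary = elementary_subsets(ROWS)
play_yes = {i for i, row in enumerate(ROWS) if row[-1] == "Yes"}

lower = lower_approximation(elementary, play_yes)
upper = upper_approximation(elementary, play_yes)

# Accuracy = number of objects in the lower approximation
#            divided by the number of objects in the upper approximation.
accuracy = sum(len(s) for s in lower) / sum(len(s) for s in upper)

print(elementary)          # [[0, 9], [1, 2, 6], [3, 4], [5, 7], [8]]
print(lower)               # [[0, 9], [3, 4]]
print(round(accuracy, 3))  # 0.571
```

Rows 1, 2 and 6 share the same conditions (Sunny, 80, True) but disagree on Play, so their subset falls into the upper approximation only. That gives 4 objects in the lower approximation and 7 in the upper one, hence 4 / 7 ≈ 0.571.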

The final result is the classification of the test dataset based on the constructed rules, together with a comparison between the algorithm's classification and the true values.

Outlook	Humidity %	Wind	Play	Classification
Overcast	87	False	Yes	Yes
Sunny	80	True	Yes	Maybe
Rainy	80	True	Yes	Unknown
Sunny	75	True	Yes	Maybe
NaN	75	True	Yes	Unknown
Overcast	80	False	No	Yes
Rainy	80	True	No	No

Accuracy of the classification RS1: 42.9 %
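One way to picture the classification step is as a lookup of each test row against the three regions. The sketch below is a simplified illustration under stated assumptions, not the script's implementation: it requires an exact match on all three condition attributes, it counts "Maybe" and "Unknown" as misses when scoring, and the names classify_exact and classification_accuracy are hypothetical. The matching and scoring used in RS_ML.py may be more permissive.

```python
# Condition tuples (Outlook, Humidity, Wind) per region, taken from the
# training example; in the script they would come from the approximations.
POSITIVE = {("Overcast", 87, False), ("Overcast", 75, True)}
BOUNDARY = {("Sunny", 80, True)}
NEGATIVE = {("Rainy", 80, False), ("Rainy", 85, False)}


def classify_exact(conditions):
    """Return a label for one row using exact matching (an assumption)."""
    key = tuple(conditions)
    if key in POSITIVE:
        return "Yes"
    if key in BOUNDARY:
        return "Maybe"
    if key in NEGATIVE:
        return "No"
    return "Unknown"  # conditions never seen in the training data


def classification_accuracy(rows):
    """Share of rows whose label equals the true Play value (last field)."""
    hits = sum(classify_exact(row[:-1]) == row[-1] for row in rows)
    return hits / len(rows)


# Illustrative rows: (Outlook, Humidity, Wind, true Play).
# The last row is a hypothetical addition, not taken from the test table.
sample = [
    ("Overcast", 87, False, "Yes"),
    ("Sunny", 80, True, "Yes"),
    ("Rainy", 80, True, "Yes"),
    ("Rainy", 85, False, "No"),
]

predictions = [classify_exact(row[:-1]) for row in sample]
print(predictions)                      # ['Yes', 'Maybe', 'Unknown', 'No']
print(classification_accuracy(sample))  # 0.5
```

Treating "Maybe" as a miss is a deliberate simplification here. A different scoring convention would change the reported accuracy.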

Code Structure

The main implemented functions of the algorithm are:

get_elementary_subsets(X): A function that returns elementary subsets of a set of objects.
get_lower(elementary, X_true_indexes): Formation of lower approximation.
get_upper(elementary, X_true_indexes): Formation of upper approximation.
get_pos_rule(pos_dataframe): Creating production rules for the positive region (the lower approximation).
get_neg_rule(not_pos_dataframe): Creating production rules for the negative region (objects outside the upper approximation).
get_maybe_rule(maybe_dataframe): Creating production rules for the boundary region.
classify_new_data(row, pos_df, maybe_df, neg_df): Classification of a test data set based on constructed rules.
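To illustrate what the rule-building functions produce, here is a minimal sketch that turns the rows of one region into a production rule string. It is an assumption about the general approach, not the code in RS_ML.py: the helper name format_rule is hypothetical, it works on plain tuples rather than DataFrames, and the output format only approximates the rules shown in the example run (values of one attribute joined with "V" for OR, attributes joined with "&" for AND).

```python
def format_rule(rows, attribute_names, decision):
    """Build one IF ... THEN rule from the condition rows of a region.

    rows            -- tuples of condition values, one per object
    attribute_names -- column names, in the same order as the tuple fields
    decision        -- text placed after the THEN part
    """
    clauses = []
    for pos, name in enumerate(attribute_names):
        # Collect the distinct values of this attribute, keeping first-seen order.
        values = []
        for row in rows:
            if row[pos] not in values:
                values.append(row[pos])
        clauses.append("(" + name + " = " + " V ".join(str(v) for v in values) + ")")
    return "IF " + " & ".join(clauses) + ' THEN DECISION "PLAY" = ' + decision


# Condition rows of the negative region from the training example (rows 5, 7, 8).
negative_rows = [
    ("Rainy", 80, False),
    ("Rainy", 80, False),
    ("Rainy", 85, False),
]

rule = format_rule(negative_rows, ["Outlook", "Humidity%", "Wind"], "DON'T PLAY")
print(rule)
# IF (Outlook = Rainy) & (Humidity% = 80 V 85) & (Wind = False) THEN DECISION "PLAY" = DON'T PLAY
```

Note that merging values per attribute makes the rule broader than the rows it was built from: it also covers value combinations that never occurred together in the training data.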
