2- Preparing X (the "features") and y (the "output")
X = data.drop(['late_aircraft_ct'], axis=1, inplace=False)  # all columns except the target
y = data['late_aircraft_ct']  # the column selected as my needed output
3- from sklearn.model_selection import train_test_split # splits the data into a part for training and a part for testing after training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=44, shuffle=True)
# Split data; .shape gives the number of rows and the number of columns in each part
print('X_train shape is ' , X_train.shape)
print('X_test shape is ' , X_test.shape)
print('y_train shape is ' , y_train.shape)
print('y_test shape is ' , y_test.shape)
-X_train = the data used to train the model; X_test = the data used to test the accuracy of the model after training.
-y_train = the known outputs for X_train, used to train the model; y_test = the known outputs for X_test, used to check the accuracy of the model's predictions after training.
-test_size=0.25 = the share of the data held out to test the model after training, here 25% of the total data.
-shuffle=True = mixes the rows before splitting, so that both the training part and the test part are a sample of the whole data.
-random_state=44 = fixes the seed of the randomization used in the split, so the same split is produced on every run.
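To make the roles of test_size, shuffle and random_state more concrete, here is a minimal pure-Python sketch of the idea behind the split. It is not scikit-learn's actual implementation, and the helper name simple_split and the toy rows are made up for illustration; it only shows that a fixed seed gives the same shuffled split every time.

```python
import random

def simple_split(rows, test_size=0.25, seed=44):
    """Hypothetical helper: shuffle row indices with a fixed seed, then cut off a test part."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)   # like shuffle=True; seed plays the role of random_state
    n_test = int(len(rows) * test_size)    # like test_size=0.25: a quarter of the rows
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [rows[i] for i in train_idx]
    test = [rows[i] for i in test_idx]
    return train, test

rows = list(range(100, 108))               # 8 toy "rows"
train_a, test_a = simple_split(rows)
train_b, test_b = simple_split(rows)       # same seed, so the same split again
print(len(train_a), len(test_a))           # 6 2
print(train_a == train_b and test_a == test_b)  # True
```

Changing the seed gives a different, but again repeatable, split.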
4- Using a model (keep in mind that there are other models ....)
from sklearn.linear_model import LinearRegression # LinearRegression is one model; there are many others.
from sklearn.preprocessing import StandardScaler # puts all features on the same scale (mean 0, standard deviation 1).
from sklearn.pipeline import make_pipeline # makes it easy to chain a sequence of operations without writing a separate step for each one.
LinearRegressionModel = make_pipeline(StandardScaler(), LinearRegression(fit_intercept=True, copy_X=True))
print(LinearRegressionModel)
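As a rough illustration of what the StandardScaler step does to a single feature, the sketch below rescales one toy column by hand. The function name standardize and the numbers are assumptions for illustration only; the real scaler works on every column of X at once.

```python
from statistics import fmean, pstdev

def standardize(column):
    """Rescale one feature column to mean 0 and standard deviation 1."""
    mean = fmean(column)
    std = pstdev(column)               # population standard deviation (divides by n)
    return [(value - mean) / std for value in column]

distances = [100.0, 200.0, 300.0]      # toy feature on a large scale
scaled = standardize(distances)
print(scaled)                          # roughly [-1.22, 0.0, 1.22]
```

Inside the pipeline, the mean and standard deviation are learned from X_train when fit is called, and the same values are reused on X_test when predict is called.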
5- LinearRegressionModel.fit(X_train, y_train) #Makes the model learn the relationship between X_train and y_train.
6- y_pred = LinearRegressionModel.predict(X_test)
print(y_pred)
# shows the predictions (y hat) for X_test, the 25% of the data that was not used in training.
7- from sklearn.metrics import mean_squared_error # a tool to calculate the cost function; there are other metrics worth looking up.
MSEValue = mean_squared_error(y_test, y_pred, multioutput='uniform_average') # multioutput can also be 'raw_values'
print('Mean Squared Error Value is : ', MSEValue)
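To see what the mean squared error measures, the same quantity can be computed by hand on a few numbers. This is a small sketch with made-up values, and the function name mean_squared_error_manual is hypothetical; it covers only the single-output case.

```python
def mean_squared_error_manual(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)

y_true_toy = [3.0, 5.0, 2.0]   # made-up true values
y_pred_toy = [2.0, 5.0, 4.0]   # made-up predictions
mse = mean_squared_error_manual(y_true_toy, y_pred_toy)
print(mse)                     # (1 + 0 + 4) / 3
```

The value is in squared units of the output; taking its square root gives an error in the same units as late_aircraft_ct.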
Converting string columns to numbers:
from sklearn.preprocessing import LabelEncoder
encoder = LabelEncoder()
for col in ['Brand', 'Model', 'Fuel_Type', 'Transmission']:
    data[col] = encoder.fit_transform(data[col])  # replace each text value with an integer code
print(data.head()) # you will see that the text values have been converted to numbers
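The encoding above can be pictured with a small pure-Python sketch. It mimics the idea of giving each distinct string an integer code in sorted order; the function name label_encode and the fuel values are made up for illustration and are not taken from the real data.

```python
def label_encode(values):
    """Map each distinct string to an integer, using the sorted order of the distinct values."""
    classes = sorted(set(values))
    mapping = {label: code for code, label in enumerate(classes)}
    return [mapping[v] for v in values], mapping

fuel = ['Petrol', 'Diesel', 'Petrol', 'Electric']   # toy values
codes, mapping = label_encode(fuel)
print(codes)     # [2, 0, 2, 1]
print(mapping)   # {'Diesel': 0, 'Electric': 1, 'Petrol': 2}
```

Note that the codes are integers and their order is only alphabetical, so a linear model may treat them as a ranking that does not really exist; one-hot encoding is a common alternative in that case.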