Ninth Montreal Industrial Problem Solving Workshop: Spilling problem

Ismael Assani, Poclaire Kenmogne, Jiliang Li, Gabriel Lemeyre, Thi Thanh Hue Nguyen, Frédérique Robin, Pierre-Loïk Rothé

August 26, 2019

Under the supervision of François Bellavance (HEC) & Olivier G. Leblanc (Air Canada)

Ninth, IPSW AIR CANADA August 26, 2019


Outline

1 Context

2 Dataset building

3 Approaches
    Machine Learning
    Survival models
    Kalman filtering approach


Context

The objective is to predict the spilling of a flight.
→ Spilling flight definition: open to interpretation.

One proposition: a spill flight is defined by the event:

“occupation rate 3 days before departure ≥ 0.95”.

The occupation rate is defined by

$$\text{occupation rate} = \frac{\text{number of bookings}}{\text{actual airplane capacity}}.$$
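The proposed definition can be written as a small check. The Python sketch below is only illustrative: the function names and the sample numbers are assumptions and do not come from the workshop data.

```python
SPILL_THRESHOLD = 0.95  # threshold from the proposed definition


def occupation_rate(bookings, capacity):
    """Number of bookings divided by the actual airplane capacity."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return bookings / capacity


def is_spill(bookings_3_days_before, capacity):
    """True when the occupation rate 3 days before departure is at least 0.95."""
    return occupation_rate(bookings_3_days_before, capacity) >= SPILL_THRESHOLD


# Hypothetical flight: 140 seats, 135 bookings three days before departure.
print(is_spill(135, 140))
```

Keeping the threshold in one named constant makes it easy to test other interpretations of "spill", since the definition is explicitly open to interpretation.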


Dataset Building and Features - 1

10 Origins and Destinations: AAA, BBB, CCC, ..., JJJ, KKK, LLL

Figure: Flights from 20 routes studied between 10 Origins and Destinations

Simplification: aggregate the data by a unique flight index (TOD).
→ longitudinal data (time series) for each flight over two years.
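As a rough illustration of this aggregation step, the Python sketch below groups booking records by flight index and produces one ordered series per flight. The record layout, the key format, and the choice to sum bookings that share a flight and observation day are all assumptions; the slides do not specify how the TOD index is built.

```python
from collections import defaultdict


def build_booking_series(records):
    """Group (flight_index, days_before_departure, bookings) records per flight.

    Returns {flight_index: [(days_before_departure, total_bookings), ...]},
    each series ordered from the earliest observation to departure.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for flight_index, days_before, bookings in records:
        # Sum bookings that share the same flight and observation day
        # (for instance, several cabin classes).
        totals[flight_index][days_before] += bookings
    return {
        flight: sorted(per_day.items(), reverse=True)
        for flight, per_day in totals.items()
    }


# Hypothetical records: (flight index, days before departure, bookings).
records = [
    ("AAA-BBB-0800", 30, 40), ("AAA-BBB-0800", 30, 12),
    ("AAA-BBB-0800", 3, 95), ("CCC-AAA-1700", 3, 60),
]
series = build_booking_series(records)
```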


Dataset Building and Features - 2

Figure: Distribution of flights as a function of departure airport (left), destination airport (middle) and departure hour (right)


Machine Learning - Random forest - 1


Machine Learning - Random forest - 2

• Data were stratified by cabin class
• 70-30 split between training and testing data
• 5-fold cross-validation using the caret package
• Average accuracy of 93% achieved for spill detection
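The slides rely on R's caret package for this step. The plain-Python sketch below only illustrates the two ideas involved, a stratified 70-30 split and 5-fold cross-validation; it is not the authors' code, and the row layout and function names are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(rows, stratum_of, train_share=0.7, seed=0):
    """Split rows into train/test sets, applying the share inside each stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for row in rows:
        by_stratum[stratum_of(row)].append(row)
    train, test = [], []
    for members in by_stratum.values():
        rng.shuffle(members)
        cut = round(train_share * len(members))
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


def k_folds(rows, k=5):
    """Yield (training, validation) pairs for k-fold cross-validation."""
    for i in range(k):
        validation = rows[i::k]
        training = [row for j, row in enumerate(rows) if j % k != i]
        yield training, validation


# Hypothetical rows: (cabin class, spill label).
rows = [("economy", i % 2) for i in range(80)]
rows += [("business", int(i % 3 == 0)) for i in range(20)]
train, test = stratified_split(rows, stratum_of=lambda row: row[0])
folds = list(k_folds(train, k=5))
```

Splitting inside each stratum keeps the cabin-class proportions identical in the training and testing sets, which a purely random split would only achieve on average.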


Machine Learning - Lasso, SVM, Gradient Boosting, logistic regression

• Data were stratified by flight
• We use Lasso to select features
• Average accuracy of 80% achieved for spill detection with SVM, logistic regression and Gradient Boosting


Machine Learning - ROC, AUC


Survival model approach - 1

Approach: train a survival model to obtain a survival function associated with each unique Origin-Destination pair.
• The selected model is the Cox proportional hazards model.
• This model allows us to predict the probability of survival according to certain flight characteristics.
• The characteristics retained are: the moment of the day, the day of the week and the week of the year.
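In a Cox proportional hazards model, the survival function for covariates x is S(t | x) = S0(t)^exp(β·x), where S0 is the baseline survival function. The Python sketch below evaluates this formula, reading "survival" as "the flight has not spilled yet". The coefficients, the indicator encoding of the covariates and the baseline value are hypothetical; they are not the fitted values from the workshop.

```python
import math


def cox_survival(baseline_survival, coefficients, covariates):
    """Survival probability under a Cox proportional hazards model.

    S(t | x) = S0(t) ** exp(beta . x), where S0(t) is the baseline survival
    at time t and beta . x is the linear predictor.
    """
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline_survival ** math.exp(linear_predictor)


# Hypothetical coefficients for two indicator covariates:
# [afternoon departure, Friday departure].
beta = [0.3, 0.5]
morning_monday = cox_survival(0.80, beta, [0, 0])
afternoon_friday = cox_survival(0.80, beta, [1, 1])
```

A positive coefficient lowers the survival probability, so in this made-up example the afternoon Friday flight is more likely to spill than the morning Monday one.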


Survival model approach - 2

Figure: Characteristics of 5 flights (flight1 to flight5). Three panels show the increasing probability to spill as a function of the moment in the day (morning, afternoon), the day of the week (1 to 7) and the week of the year.


Survival model approach - 3

Applying the model to one flight, to predict the probability of spill 3 days before departure given that we are 30 days from departure, gives: prediction score = 67.01%; MSE = 53.17%.

• Low prediction capacity, but this is expected since the model does not take any other information into account.
• Can be used as feature engineering to improve another model.
• Possible improvement: add more relevant variables that may explain spill (e.g. price range 30 days before departure).
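The conditional prediction described on this slide (probability of spill by a target date, given no spill so far) follows from any survival function S through 1 - S(target) / S(now). The Python sketch below shows that computation only. The time axis (days elapsed since bookings opened) and the tabulated curve are assumptions for illustration, not workshop results.

```python
def conditional_spill_probability(survival, now, target):
    """P(spill in (now, target] | no spill up to now) = 1 - S(target) / S(now).

    survival: function returning the probability of no spill up to a given time.
    now, target: times on the same axis as the survival function, now <= target.
    """
    if target < now:
        raise ValueError("target must not precede now")
    s_now = survival(now)
    if s_now <= 0.0:
        raise ValueError("survival at `now` must be positive")
    return 1.0 - survival(target) / s_now


# Hypothetical survival curve tabulated by days elapsed since bookings opened:
# day 150 stands for "30 days from departure", day 177 for "3 days before".
curve = {0: 1.0, 150: 0.9, 177: 0.6}
p = conditional_spill_probability(curve.get, now=150, target=177)
```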


Kalman Filtering - 1

Approach: compute a forecast of the plane occupation and conclude whether the flight spills or not

Historical data and measurements → Occupation rate forecast → Spilling forecast

Principle:

• Infer the dynamic for the current booking: use historical data to fit a polynomial regression.
• Modify the dynamic to fit current measurements (data-driven approach): use Kalman filtering to enrich the dynamic with current observations.
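The two steps of the principle can be sketched with a scalar Kalman filter. The slides do not give the state-space model, so the version below is one possible reading and not the authors' implementation: the state is the occupation rate, the prediction step follows the day-to-day increments of the historical polynomial, and the update step corrects it with the current measurement. The polynomial coefficients (which could come from a fit such as `numpy.polyfit` on last year's data) and the noise variances `q` and `r` are hypothetical.

```python
def historical_curve(day, coeffs):
    """Evaluate a polynomial (highest-degree coefficient first) with Horner's rule."""
    value = 0.0
    for c in coeffs:
        value = value * day + c
    return value


def kalman_occupation_forecast(measurements, coeffs, horizon, q=1e-4, r=1e-3):
    """Filter observed occupation rates, then extrapolate to day `horizon`.

    measurements: observed occupation rates for days 0..n-1 of the booking period
    coeffs: hypothetical polynomial fitted on the historical booking curve
    q, r: assumed process and measurement noise variances
    """
    if not measurements:
        raise ValueError("at least one measurement is required")
    x = historical_curve(0, coeffs)  # initial state from the historical model
    p = 1.0                          # large initial uncertainty
    for day, z in enumerate(measurements):
        if day > 0:
            # Predict: move the state by the historical day-to-day increment.
            x += historical_curve(day, coeffs) - historical_curve(day - 1, coeffs)
            p += q
        # Update: blend prediction and measurement with the Kalman gain.
        k = p / (p + r)
        x += k * (z - x)
        p *= 1.0 - k
    # Forecast: keep following the historical increments without new data.
    last_day = len(measurements) - 1
    return x + historical_curve(horizon, coeffs) - historical_curve(last_day, coeffs)


# Hypothetical linear historical model (0.5% occupation gained per day) and
# three measurements from a flight that is booking faster than last year.
coeffs = [0.005, 0.0]
forecast = kalman_occupation_forecast([0.0, 0.01, 0.02], coeffs, horizon=10)
spills = forecast >= 0.95
```

Each new measurement costs a handful of arithmetic operations, so the forecast can be refreshed daily for every flight at very low cost.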


Kalman Filtering - 2

Figure: One flight occupation prediction using Kalman Filters and historical model (Polynomial degree: 5). Curves shown: Kalman Filter Prediction, Historical model (2017), Measurements (2018).


Kalman Filtering - 3

Figure: One flight occupation prediction using Kalman Filters and historical model (Polynomial degree: 5). Curves shown: Kalman Filter Prediction, Historical model (2017), Measurements (2018), and the 95% Occupation level.


Kalman Filtering - 4

                        Actual   Predicted
Spill occurrence rate   36%      40%

Figure: Results for a dataset of 11,307 flights

Prediction score: 73%
False negative: 12%

Perspectives:
• Improving the historical dynamic model
• Machine learning initial guess for new flights (without historical data)
• The Kalman filtering approach allows day-to-day updates of the occupation forecast with minimal computational load


Acknowledgments

Thank you for your attention!

Special thanks to Olivier, François, Caroline and Fabian . . .

and Odile for the organization