Data
We provide four example datasets to which the pretrained NNE can be applied. They are stored as MATLAB files in the “sample_data” folder at this GitHub directory. These datasets are used in Wei and Jiang (2025) and come from publicly accessible sources. The paper gives more detailed descriptions of each dataset.
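Before applying the pretrained NNE, it can help to check what each file actually contains. The following is a minimal sketch, assuming the repository has been cloned and MATLAB's current folder is its root, so that `sample_data` resolves as a relative path. It assumes nothing about the variable names inside the files; it only lists them.

```matlab
% List the MAT-files in the sample data folder and show what each contains.
% Assumption: the current folder is the root of the cloned repository, so
% that 'sample_data' resolves as a relative path.
data_dir = 'sample_data';
files = dir(fullfile(data_dir, '*.mat'));

for k = 1:numel(files)
    file_path = fullfile(files(k).folder, files(k).name);
    fprintf('--- %s ---\n', files(k).name);
    % whos with the -file option prints variable names, sizes, and classes
    % without loading the data into the workspace.
    whos('-file', file_path);
end
```

The sizes reported this way can be compared with the session counts given in the descriptions below.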
Description of the datasets
Expedia - destination 1
This dataset comes from a Kaggle contest on hotel searches and bookings on Expedia.com and has been used in several papers to study online consumer search behavior. The dataset here consists of the search sessions for the largest travel destination in the contest. There are \(n = 1258\) sessions, with 3 product attributes, 2 consumer attributes, and 1 advertising attribute.
Expedia - destination 2
This dataset consists of the search sessions for the second-largest travel destination in the same contest as above. There are \(n = 897\) sessions, slightly fewer than the pretrained NNE requires (see code). Nevertheless, the pretrained NNE appears to work acceptably in this case despite the small shortfall.
Trivago - desktop channel
This dataset comes from the ACM RecSys Challenge, which analyzes user sessions on Trivago.com. The dataset here consists of the search sessions made on the desktop channel. Because Trivago is a meta-search engine, its setting does not exactly fit the standard sequential search model. Nevertheless, we find it a good place to try out the pretrained NNE.
Trivago - mobile channel
This dataset consists of the search sessions made on the mobile channel, from the same source as above.
Papers
Wei, Yanhao ‘Max’ and Zhenling Jiang (2025), “Pretraining Estimators for Structural Models: Application to Consumer Search.”