IEEE COMMUNICATIONS LETTERS, VOL. 15, NO. 10, OCTOBER 2011
Modeling Web Browsing on Mobile Internet
Gou-feng Zhao, Qing Shan, Shasha Xiao and Chuan Xu
Abstract: In this letter, we present a Web browsing model for mobile Internet. Instead of relying on simulation data, our approach is based on real measurement data. The model has six parameters: main-object size, embedded-object size, number of embedded objects, inter-arrival time of embedded objects, reading time, and session duration. Using the Kolmogorov-Smirnov (KS) test, we fit the measurement results to matching statistical distributions. Our results differ significantly from those of previous Web studies.
Index Terms: Mobile Internet, web ...
Tamás Varga et al. found that session arrivals and reading time can be well modeled with a Poisson arrival process and a logarithmic-type distribution, respectively. A Web browsing model has also been presented that includes main-object size, embedded-object size, number of embedded objects per page, and embedded-object inter-arrival time. However, most of this research relies on simulations of wired network environments rather than real data from mobile Internet, so its conclusions are less convincing for mobile Web browsing.

In this letter, based on real data collected for a week from a WAP gateway of a major mobile telecom carrier in Chongqing, China, home to more than 30 million people, we construct a model for Web browsing on mobile Internet. It consists of six parameters: main-object size, embedded-object size, number of embedded objects, embedded-object inter-arrival time, reading time, and session duration. The Kolmogorov-Smirnov (KS) goodness-of-fit test assesses whether a specific distribution is a good fit to the empirical distribution of a measured data set. As in earlier studies, we use the KS test to select the best-fitting distribution for each model parameter based on the data set. We found that our results differ significantly from those of previous studies.

The rest of the paper is organized as follows: Section II introduces our data set. Section III presents the results of modeling Web browsing on mobile Internet. We conclude the paper in Section IV.

II. DATA ELABORATION
Our raw data was collected from a WAP gateway of a major mobile telecom carrier in Chongqing, China, for one week, from Apr. 5, 2010 to Apr. 11, 2010. The log file has a total size of 130 GB and contains 17,316,616 records. Based on the dataset, we found that the most popular services on mobile Internet are games, chat, and Web browsing, each accounting for nearly one third of the total traffic. This letter focuses on Web browsing. There are 9,456,475 valid Web requests, of which 42.7% are main-object requests and 57.3% are embedded-object requests. Each record contains fields such as Time, Calling Number, Client IP Address, User Agent, URL, Content Type, Domain, In-Status, and Uplink and Downlink length. For example, the object size is recorded in the Downlink field, which we use to compute the main-object size and embedded-object size. A Web request generally involves two kinds of objects: main objects and embedded objects.
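As a rough illustration of the record layout and request shares described above, the following minimal sketch models one log record and derives the approximate request counts. The class name `WapRecord`, the field types, and the helper `object_size` are hypothetical; the actual log format, delimiters, and units are not specified in the text, and the counts are only approximate because the reported percentages are rounded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WapRecord:
    """One WAP gateway log record (hypothetical layout).

    Field names follow the list given in the text; types are assumptions.
    """
    time: str
    calling_number: str
    client_ip: str
    user_agent: str
    url: str
    content_type: str
    domain: str
    in_status: str
    uplink_length: int
    downlink_length: int  # the text says object size is read from this field


def object_size(record: WapRecord) -> int:
    """Return the object size, taken from the Downlink field."""
    return record.downlink_length


# Request counts implied by the reported shares (approximate: the
# percentages in the text are rounded to one decimal place).
VALID_WEB_REQUESTS = 9_456_475
main_requests = round(VALID_WEB_REQUESTS * 0.427)
embedded_requests = VALID_WEB_REQUESTS - main_requests

# Average number of embedded-object requests per main-object request.
embedded_per_main = embedded_requests / main_requests
```

Under these shares there are roughly 1.3 embedded-object requests for every main-object request, which is only an aggregate figure and says nothing about the per-page distribution.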
In our analysis, a main object is a file that contains an HTML/XML document; an embedded object is a file linked from the hypertext document, in formats such as CSS, PNG, GIF, JPG, BMP, FLV, RAR, EXE, MP3, ZIP, SWF, ISO, ICO, and JS.

III. MODELING WEB BROWSING ON MOBILE INTERNET
In this section, we present our model for Web browsing on mobile Internet through six factors. For each factor, we first show and analyze the measurement result, then fit the result to a distribution using the KS test...
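The fit-then-compare step can be sketched as follows. This is a minimal, standard-library illustration of ranking candidate distributions by their KS distance, not the authors' procedure: the candidate set (exponential and lognormal), the maximum-likelihood parameter estimates, and all function names are assumptions, and the sample data is synthetic rather than taken from the WAP gateway log.

```python
import math
import random
from typing import Callable, Dict, List

Cdf = Callable[[float], float]


def ks_statistic(samples: List[float], cdf: Cdf) -> float:
    """Kolmogorov-Smirnov distance D = sup_x |F_n(x) - F(x)|."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def exponential_cdf(mean: float) -> Cdf:
    return lambda x: 1.0 - math.exp(-x / mean) if x > 0 else 0.0


def lognormal_cdf(mu: float, sigma: float) -> Cdf:
    def cdf(x: float) -> float:
        if x <= 0:
            return 0.0
        z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
        return 0.5 * (1.0 + math.erf(z))
    return cdf


def ks_scores(samples: List[float]) -> Dict[str, float]:
    """Fit each candidate by maximum likelihood and return its KS distance.

    Samples must be strictly positive (sizes or durations).
    """
    n = len(samples)
    mean = sum(samples) / n
    logs = [math.log(x) for x in samples]
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    candidates = {
        "exponential": exponential_cdf(mean),
        "lognormal": lognormal_cdf(mu, sigma),
    }
    return {name: ks_statistic(samples, c) for name, c in candidates.items()}


# Synthetic stand-in for a measured quantity such as object size.
rng = random.Random(0)
synthetic_sizes = [rng.lognormvariate(8.0, 1.5) for _ in range(2000)]
scores = ks_scores(synthetic_sizes)
best = min(scores, key=scores.get)  # smallest KS distance fits best
print(best, scores)
```

Because the parameters are estimated from the same samples, the standard KS critical values are not strictly valid here; the sketch uses the KS distance only to rank candidates against each other.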