Title: Modeling methods for residential energy consumption and user behaviors of online-targeted display advertising
Authors: Guan, Jingjing (關菁菁)
Abstract: This dissertation presents modeling methods for two research problems:
residential energy consumption (REC) and user click-through behaviors towards
online-targeted display advertising (OTDA).
Understanding REC via modeling household-level survey data is
important for governments, energy corporations, and home appliance
manufacturers. National residential energy consumption surveys (RECS) collect
household-level data with stratified random sampling schemes. RECS data,
consequently, have a natural and explicit multilevel structure, reflected by
geographical clustering of households. To handle this multilevel structure, we
introduce a multilevel regression model. This approach divides overall REC
variations into two sources, area-level and household-level variations, and
explains them with environmental and household-level variables, respectively. With a 53%
explained variance proportion (EVP) (82% of area variations and 47% of
household-level variations), the proposed multilevel regression model
outperforms previous models, e.g., traditional linear regression models.
Furthermore, the multilevel regression model largely reduces the impact of area
variations on the estimated effects of influencing factors of REC.
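As a rough check on how the reported EVP figures fit together, the sketch below assumes that the overall EVP is the variance-share-weighted average of the area-level and household-level EVPs. This decomposition and the function name `implied_area_share` are illustrative assumptions, not taken from the dissertation, and the inputs are the rounded percentages quoted above.

```python
def implied_area_share(evp_total, evp_area, evp_household):
    """Solve evp_total = s * evp_area + (1 - s) * evp_household for s.

    s is the share of total variance at the area level (assumed
    decomposition; hypothetical helper, not from the dissertation).
    """
    if evp_area == evp_household:
        raise ValueError("share is not identifiable when both EVPs are equal")
    return (evp_total - evp_household) / (evp_area - evp_household)


# Rounded figures from the abstract: 53% overall, 82% area, 47% household.
share = implied_area_share(0.53, 0.82, 0.47)
print(f"implied area-level share of total variance: {share:.3f}")
```

Under this assumption, roughly one sixth of the total variation would sit at the area level, which is consistent with household-level variation dominating the overall EVP.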
We introduce regularized linear models (RLMs) with the elastic net
penalty to model REC. The purpose is to identify significant factors among all
utilizable variables from RECS micro-datasets. This approach imposes no
antecedent theory on variable selection. The elastic net penalty is a compromise between
the ridge-regression and LASSO penalties. It helps to handle more than 500
RECS variables with complicated multicollinearity in one model. With the U.S.
2009 RECS dataset, we develop a series of RLMs with the elastic net penalty. All
constructed RLMs simultaneously assign non-zero effects to 98 selected variables.
The best-fit RLM, which explains 65% of the total variability, outperforms most
previous models in the literature.
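To make the "compromise" between the ridge and LASSO penalties concrete, here is a minimal sketch of the elastic net penalty and the corresponding single-coefficient update. It assumes the common parameterization with a strength `lam` and a mixing weight `alpha` (alpha = 1 gives the LASSO, alpha = 0 gives ridge) and standardized predictors; the abstract does not state which parameterization the dissertation uses, so the names and scaling here are assumptions.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2)."""
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; return exactly 0 if |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def coordinate_update(z, lam, alpha):
    """One-coefficient elastic net update for a standardized predictor.

    z is the univariate least-squares coefficient on the partial residual.
    The L1 part can set the coefficient exactly to zero (variable
    selection); the L2 part shrinks it further, which stabilizes
    estimates under multicollinearity.
    """
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))
```

The exact zeros produced by the soft-thresholding step are what allow a model with hundreds of candidate variables to retain only a subset with non-zero effects.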
OTDA, as a new mode of online display advertising, has developed
rapidly due to its capability to target potential customers. This dissertation
addresses the issue from the perspective of OTDA publishers. As many
management problems inherently involve optimization and statistical modeling, we develop a two-step forecasting method to forecast user click-through behaviors
towards OTDA, so as to control uncertainties in formulating allocation plans for
OTDA resources.
We introduce a finite mixture regression model, i.e., an arbitrary-points-inflated
(API) Poisson regression model, as groundwork for this method. With an offset in the
Poisson component, this model can handle count data with an arbitrary number of
inflated points and link clicks with page views. We develop algorithms for
parameter estimation, adaptively choosing the best-fit API Poisson regression
model according to the Bayesian information criterion (BIC), and obtaining the
corresponding Hessian matrix.
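The following sketch shows one plausible form of an API Poisson probability mass function, written as the natural generalization of a zero-inflated Poisson: extra point masses at a set of inflated counts, mixed with a Poisson component whose mean carries page views as an offset. The exact formulation in the dissertation may differ; the function names, the `inflation` mapping, and the BIC convention used here are assumptions for illustration.

```python
import math


def poisson_pmf(y, mu):
    """Poisson probability of count y with mean mu > 0 (computed in logs)."""
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))


def mean_with_offset(page_views, linear_predictor):
    """log(mu) = log(page_views) + x'beta, so clicks scale with page views."""
    return page_views * math.exp(linear_predictor)


def api_poisson_pmf(y, mu, inflation):
    """Mixture pmf: point masses at inflated counts plus a Poisson part.

    inflation maps each inflated count to its extra probability mass,
    e.g. {0: 0.3, 1: 0.1}; the remaining mass goes to the Poisson part.
    """
    extra_mass = sum(inflation.values())
    if not 0.0 <= extra_mass < 1.0:
        raise ValueError("inflation masses must sum to a value in [0, 1)")
    return inflation.get(y, 0.0) + (1.0 - extra_mass) * poisson_pmf(y, mu)


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better fit."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical example: 1000 page views, click rate exp(log(0.002)) per view.
mu = mean_with_offset(1000, math.log(0.002))
total = sum(api_poisson_pmf(y, mu, {0: 0.3, 1: 0.1}) for y in range(200))
```

An adaptive procedure could fit candidate models with different sets of inflated points and keep the one with the lowest BIC, since each additional inflated point adds a parameter that the criterion penalizes.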
The two-step forecasting method involves a modeling and a predicting
step. It can forecast user clicks towards matches of advertising requests and
candidate allocation plans, based on data observed in the current period. The
modeling step segments data in the current period into sub-samples of adequate
size, and constructs sub-models using an adaptive API Poisson
regression algorithm. The predicting step provides two predicting schemes and
selects the scheme with higher per-campaign prediction accuracy as the final
scheme to forecast user clicks in the next period. Moreover, the proposed method
computes quickly and yields robust parameter estimates.
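The scheme-selection part of the predicting step can be sketched as follows. The abstract does not specify how per-campaign accuracy is measured or which observations it is evaluated on, so this example assumes mean absolute error over campaigns on held-out click counts; the scheme names, data, and helper functions are hypothetical.

```python
def mean_abs_error(predicted, actual):
    """Average absolute difference between predicted and observed clicks."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def select_scheme(predictions_by_scheme, actual_clicks):
    """Return the name of the scheme with the smallest per-campaign error."""
    return min(
        predictions_by_scheme,
        key=lambda name: mean_abs_error(predictions_by_scheme[name], actual_clicks),
    )


# Hypothetical held-out clicks for four campaigns and two candidate schemes.
actual = [0, 3, 12, 1]
candidates = {
    "scheme_a": [1, 2, 10, 1],
    "scheme_b": [0, 5, 20, 4],
}
best = select_scheme(candidates, actual)
print(best)
```

Any other accuracy measure could be substituted for `mean_abs_error` without changing the selection logic.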
We adapt this two-step forecasting method to forecast user clicks towards
OTDA for a social network site. The empirical results show that our approaches
have higher accuracy than previous methods, including logistic regression and
truncated logistic regression. The ensemble-predicting scheme achieves
higher accuracy in forecasting non-zero clicks than the campaign-to-campaign
predicting scheme. The model involving page views possesses the
smallest prediction error among all alternative models being considered. Finally,
we present a brief discussion on forecasting page views and suggest a further
extension of the API Poisson regression to model count data that follow
distributions other than the Poisson.
Notes: CityU Call Number: HB849.49 .G83 2014; xviii, 197 p. : ill. 30 cm.; Thesis (Ph.D.)--City University of Hong Kong, 2014.; Includes bibliographical references (p. 177-195)