Towards Better Understanding of Factors Influencing the QoE by More Ecologically-Valid Evaluation Standards

Project facts

Project promoter:
AGH University of Science and Technology (PL)
Project Number:
PL-Basic Research-0027
Status:
Completed
Final project cost:
€918,157
Donor Project Partners:
Norwegian University of Science and Technology (NO)
Programme:

Description

Our goal is to better understand the process of interacting with a video service from a macro-temporal, interdisciplinary perspective. We create a description (potentially a hierarchy) of the factors influencing QoE. We also propose a new model describing the interdependencies between these factors. One such factor is how interested a person is in the content being consumed. The model places this factor in relation to the other factors and assigns it a weight. Apart from the model, we also aim to create a new standard for QoE evaluation. This translates into appropriate subjective testing procedures that allow the identified factors to be operationalized. Importantly, we address the question of how to narrow the gap between lab experiments (high internal validity, low ecological validity) and field studies (high ecological validity, low internal validity). To achieve these goals, we explore a set of subjective testing methods with different degrees of ecological validity. Specifically, we run 15 subjective experiments with 460 participants in total. One of the studies, targeting how QoE evolves over time, is longitudinal and lasts for 30 consecutive months. Such a long study is unprecedented and constitutes an important milestone for the field. Conducting the subjective studies in two countries also offers important advantages for validating the proposed measures and methodologies. This project provides a chance to achieve a holistic, well-tested view of QoE. It also works towards more ecologically valid subjective testing methodologies. Both translate into a better understanding of human behaviour (and of how QoE relates to behaviour). The results not only carry the potential for commercialisation but, more importantly, offer the prospect of exploiting QoE to improve QoL.

Summary of project results

The TUFIQoE project addressed a fundamental issue in multimedia streaming: existing Quality of Experience (QoE) assessment methods are limited in how accurately they capture viewer experiences under dynamic, real-world conditions. Conventional subjective experiments often restrict interaction with the service to viewing or hearing a brief sample, resulting in low mundane realism, that is, a lack of resemblance to real-world settings. This limited realism can, in turn, reduce the external validity of these experiments. Additionally, people tend to discuss content more than quality itself, suggesting that conventional subjective experiments may not fully capture the nuances of quality as experienced in everyday life. TUFIQoE tackled these challenges by developing new subjective experiments that integrate real-life user interactions with multimedia content on services such as YouTube and Netflix, while also considering a range of influencing factors such as social interaction, psychological influences, and the effects of time.

The project also emphasized the importance of including underrepresented demographic groups, such as elderly users, to build a broader understanding of QoE factors that extend beyond young, tech-savvy populations. These goals provided a strong foundation for advancing QoE research, enabling it to more accurately reflect user satisfaction in real-world multimedia consumption.

TUFIQoE undertook extensive experimental activities and developed innovative tools to achieve the project’s objectives. The team conducted numerous subjective experiments, each distinct from conventional approaches. These ranged from slight modifications to traditional methods, like ACR Without the Scale, to entirely new experimental designs, such as Fix Your Netflix, which allowed participants to interact naturally with the service: selecting content, viewing entire sequences, and experiencing settings designed to mimic a home environment (achieved in Your Netflix, Our Lab). Various experiments focused on psychological aspects, including quality estimation for emotionally charged content and comparisons of quality evaluations when participants watched alone versus with a friend. The most challenging experiment explored how time affects quality scores by collecting scores in two ways: through single-session lab studies, and through a longitudinal design in which participants rated one sequence per day and, after these repeated exposures spread over several days, scored the entire week’s experience.

In parallel with these new subjective experiments, the project developed a theoretical foundation to interpret results and guide the design of future experiments. This theoretical approach bridged psychology and technical insights into content delivery, using causal modeling theories to move beyond correlation and understand causal relationships. These models gave a clearer view of the limits of the conclusions that can be drawn from specific experiments, enabling more reliable insights.

Additionally, we advanced the understanding of how to model subjective quality scores. This is crucial for comparing our new methods with conventional subjective experiments, since such comparisons require precise statistical descriptions. For each experimental design proposed, we developed appropriate protocols and accompanying software to ensure rigor and consistency in methodology.

The most important result of the TUFIQoE project is the finding that results from conventional subjective experiments generally correspond well to those from experiments with higher mundane and psychological realism—provided the focus is solely on quality. However, conventional experiments fall short when it comes to measuring additional factors, such as viewing content with friends or watching emotionally engaging material. This has significant implications for the QoE community. On one hand, it confirms that both academia and industry are using appropriate tools to measure quality. On the other hand, if we aim to consider factors beyond quality, a new approach is necessary.

The project proposed such an approach by conducting experiments that do not explicitly ask about quality. Instead, these experiments rely on user behavior to gauge experience. Further study is needed to understand the differences between conventional experiments that focus on quality ratings and those based on natural user behaviors, as these methods may yield distinct insights.

A separate set of results relates to long-term studies. We demonstrated that laboratory studies cannot reliably approximate long-term experiences. To address this, we proposed a model that describes quality over a long-term interaction, a unique and challenging accomplishment given the difficulty of conducting such experiments repeatedly. Long-term quality assessment is particularly valuable for industry, as strategic decisions—such as whether users stay with or leave a service—are often based on extended user interactions.

The direct beneficiaries of the TUFIQoE project include companies and their clients in the streaming industry, as they gain a better understanding of the limitations of models derived from conventional subjective experiments. Ultimately, all users benefit from improved service quality. Finally, the research community gains valuable evidence underscoring the importance of developing new methods that move beyond conventional subjective experiments to achieve a deeper understanding of user behavior.

Summary of bilateral results

The cooperation was satisfactory. Seven joint works have been published within the project. The consortium brought together complementary skills. The PI’s team provided insights from an engineering-focused perspective. In particular, it contributed expertise in system-level (i.e., network and visual signal level) indicators and their correlation with QoE. The NPI’s team, on the other hand, brought a perspective focused on human behaviour. The contractors from AGH and NTNU have participated together in several important events at the intersection of business and science. A joint application was submitted in 2021 and resubmitted in 2022 and again in 2023 (currently under evaluation).

Information on the projects funded by the EEA and Norway Grants is provided by the Programme and Fund Operators in the Beneficiary States, who are responsible for the completeness and accuracy of this information.