A Monte Carlo Framework at NLO EW and QCD

This article introduces an automated Monte Carlo framework constructed by Huanfeng. The framework is capable of computing the cross sections (probabilities) of processes in the Standard Model of particle physics at next-to-leading-order (NLO) electroweak (EW) and QCD accuracy.

last updated: 02/19/2023


  • For the theoretical details and full results, please refer to the published paper.
  • The framework can be downloaded, installed and used following the instructions here.

In Big Picture, I briefly describe the theoretical background of particle physics, as well as the motivation for and role of the constructed Monte Carlo framework. In Structure of the Framework, I discuss how the Monte Carlo framework is implemented and highlight its key components. I present two plots in Selected Results to showcase the feasibility of the framework. Some bottlenecks of the framework are revealed in Summary.

Feel free to skip the first part and take a look at what the framework looks like!


Big Picture 🔗

What have we achieved? 🔗


FIG. 1: The Standard Model of particle physics. [CERN]

The Higgs boson was discovered in 2012 at the CERN Large Hadron Collider (LHC). Since then, the Standard Model (SM) of particle physics has been a complete model describing the strong and electroweak interactions, as represented in FIG. 1: the matter particles comprise six quarks and six leptons, the gluon ($g$) is the strong force carrier, and the $W^{\pm}$, $Z$ bosons and the photon ($\gamma$) are the electroweak force carriers.

However … 🔗

There are still many open questions that cannot be addressed by the SM, such as the nature of dark matter, the origin of neutrino oscillations, gravity, etc. In such a ‘post-Higgs-boson-discovery’ era, the LHC is one of the places where we can search for evidence of new physics.


How can we approach? 🔗

Two strategies:

  1. Direct search: production/resonance of new particles. However, we do not know exactly the energy scale at which new particles can be produced; if it lies above the multi-TeV range, we will not be able to produce them in the near future.

  2. Indirect search: small deviations from the SM predictions due to the underlying new particles interacting with the known particles.

It is really a matter of choice. Direct searches require a beyond-the-Standard-Model (BSM) theory, i.e. some new model. Indirect searches require precision calculations within the SM.

Surely, these two strategies are not totally independent.


I follow the strategy of indirect search and focus on one window, 🔗

Triple electroweak gauge boson production (triboson production), because

  1. It is one of the least tested processes; in particular, measurements of massive triboson production have only become accessible since LHC Run 2.

  2. It has very rich decay products, which can be backgrounds to other SM processes and to BSM searches.

  3. It manifests triple and quartic gauge couplings. We can not only verify the non-Abelian gauge structure predicted by the SM, but also search for anomalous gauge couplings.

You may picture a triboson production as shown in FIG. 2, where $W^{+}$, $Z$ bosons and a photon ($\gamma$) are produced and further decay into leptons.


FIG. 2: A sample diagram of a triboson production with leptonic decays.

Structure of the Framework 🔗

What is the framework computing? 🔗

It computes the hadronic cross section of processes at the CERN Large Hadron Collider (LHC). The “cross section” indicates the “probability” of a process happening in proton-proton collisions at a certain energy, and can be expressed as $$\sigma(p_1,p_2)=\sum_{a,b}\int_{0}^{1} dx_1\, dx_2\, f_a(x_1,\mu_F)\, f_b(x_2,\mu_F)\, \hat{\sigma}(p_a,p_b,\mu_F,\mu_R),$$ where the sum is taken over all possible combinations of partons $a,b$ with momenta $p_{a,b}$ determined by the fractions $x_{1,2}$ of the protons' momenta $p_{1,2}$. $f_{a,b}$ are the parton distribution functions (PDFs), depending on the momentum fractions $x_{1,2}$ and the factorization scale $\mu_F$. $\hat{\sigma}$ denotes the partonic cross section, which in the case of NLO QCD also depends on the renormalization scale $\mu_R$.
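
As a minimal illustration (not the framework's actual implementation), the integrand of this convolution for a single parton pair could be evaluated with the LHAPDF Python bindings; the PDF set name and the toy partonic cross section below are placeholders:

```python
import lhapdf  # Python bindings of the LHAPDF library

# Placeholder PDF set; the set actually used in the framework may differ.
pdf = lhapdf.mkPDF("NNPDF31_nlo_as_0118", 0)

def integrand(x1, x2, mu_F, sigma_hat, pid_a=2, pid_b=-2):
    """Integrand of the hadronic cross section for one (a, b) parton pair.

    LHAPDF's xfxQ returns x*f(x, mu_F), so divide by x to obtain f_a, f_b.
    sigma_hat is a user-supplied partonic cross section (a toy one here).
    """
    f_a = pdf.xfxQ(pid_a, x1, mu_F) / x1
    f_b = pdf.xfxQ(pid_b, x2, mu_F) / x2
    return f_a * f_b * sigma_hat(x1, x2, mu_F)

# Toy partonic cross section, only to make the sketch runnable.
toy_sigma_hat = lambda x1, x2, mu_F: 1.0

# u ubar contribution at mu_F = m_Z for one sample point (x1, x2):
print(integrand(0.1, 0.2, 91.1876, toy_sigma_hat))
```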

Despite the compact form, there is a subtlety, 🔗

the partonic cross section $\hat{\sigma}$ contains singularities. To make numerical computations possible, it has to be regulated as $$\hat{\sigma}=\int_{2\to 5}d\sigma^{\text{B}} + \int_{2\to 5} \left( d\sigma^{\text{V}} + d\sigma^{\text{I}} \right) + \int_{2\to 6} \left( d\sigma^{\text{R}} - d\sigma^{\text{A}} \right) + \int_{2 \to 5}d\sigma^{\text{C}}.$$ Here, $\text{B}$ denotes the “Born level”, i.e. FIG. 2. $\text{V}$/$\text{R}$ denote the “virtual”/“real” corrections, i.e. an additional particle is exchanged internally (in a loop) or emitted externally on top of $\text{B}$. The terms $\text{I}$, $\text{A}$ and $\text{C}$ are the so-called “dipole” terms, which make each of the three integrals finite. The subscript, e.g. $2\to 5$, gives the number of incoming and outgoing particles of the considered process, cf. FIG. 2; the real corrections $\text{R}$ have one extra external particle.
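
To make the bookkeeping concrete, here is a schematic sketch (not the framework's own code) of how the regulated partonic cross section is assembled from its three separately finite Monte Carlo integrals; all integrands and dimensionalities below are placeholders standing in for the evaluations provided by RECOLA and MadDipole:

```python
import random

# Crude flat-sampling Monte Carlo estimate over the unit hypercube,
# only to illustrate how the finite pieces are combined.
def mc_estimate(integrand, ndim, n_events):
    total = sum(integrand([random.random() for _ in range(ndim)])
                for _ in range(n_events))
    return total / n_events

# Placeholder integrands for the three separately finite contributions.
born_virtual_I = lambda x: 1.0   # 2 -> 5 kinematics: B + (V + I)
real_minus_A   = lambda x: 0.1   # 2 -> 6 kinematics: R - A
collinear_C    = lambda x: 0.01  # 2 -> 5 kinematics: C

# Illustrative dimensionalities; the 16 for (R - A) matches the number
# quoted in the Summary, the others are assumptions.
sigma_hat = (mc_estimate(born_virtual_I, ndim=13, n_events=10_000)
             + mc_estimate(real_minus_A,  ndim=16, n_events=10_000)
             + mc_estimate(collinear_C,   ndim=14, n_events=10_000))
print(sigma_hat)
```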

The Monte Carlo Framework is implemented as 🔗


FIG. 3: The structure of the Monte Carlo framework.

The Monte Carlo framework utilizes various tools to compute the different components appearing in the expression for $\hat{\sigma}$ above. Specifically, the integrands are calculated using the following tools:

  • $\sigma^\text{B}$ (leading order), $\sigma^\text{V}$ (virtual one-loop), $\sigma^\text{R}$ (real radiation): RECOLA,

  • $\sigma^\text{I}$ (integrated dipoles), $\sigma^\text{A}$ (differential dipoles), $\sigma^\text{C}$ (collinear counterterms): MadDipole,

  • $f$ (parton distribution functions in the expression for $\sigma$): LHAPDF.

For the numerical integrations, the VEGAS algorithm is applied, which uses importance sampling to reduce the variance.
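
To illustrate the idea of adaptive importance sampling (not the framework's own VEGAS implementation), a peaked test integrand can be handled with the standalone `vegas` Python package; the integrand and settings below are purely illustrative:

```python
import math
import vegas  # standalone Python implementation of the VEGAS algorithm

def peaked_integrand(x):
    """Sharply peaked 4-dimensional test integrand, standing in for the
    (much more expensive) matrix-element integrands of the framework."""
    dx2 = sum((xi - 0.5) ** 2 for xi in x)
    return 1000.0 * math.exp(-100.0 * dx2)

integ = vegas.Integrator(4 * [[0.0, 1.0]])

# First iterations adapt the importance-sampling grid to the peak ...
integ(peaked_integrand, nitn=10, neval=20000)
# ... then the adapted grid is used for the final estimate.
result = integ(peaked_integrand, nitn=10, neval=20000)
print(result.summary())
```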

Due to the high multiplicity of the process, it is necessary to sample a large number of events, e.g. ~10M for $\sigma^{\text{B}}$, ~1M for $(\sigma^{\text{R}}-\sigma^{\text{A}})$ and 5K for $\sigma^{\text{V}}$, to achieve numerically stable results. Parallel computations are inevitable, especially for the NLO EW corrections, where evaluating one event for $\sigma^{\text{V}}$ takes ~15 secs on a 2.3 GHz Intel Core i5.

I perform the parallel computations on the UB HPC cluster, using up to 1000 cores.
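
As a minimal sketch of this event-level parallelism (not the actual cluster workflow, whose batch scripts are not shown here), independent Monte Carlo runs with different random seeds can be distributed over cores and combined by an inverse-variance weighted average; the integrand is a placeholder:

```python
import math
import random
from multiprocessing import Pool

def one_run(args):
    """One independent Monte Carlo run with its own random seed."""
    seed, n_events = args
    rng = random.Random(seed)
    samples = [math.exp(-4.0 * rng.random()) for _ in range(n_events)]  # toy integrand
    mean = sum(samples) / n_events
    var = sum((s - mean) ** 2 for s in samples) / (n_events * (n_events - 1))
    return mean, math.sqrt(var)

def combine(results):
    """Inverse-variance weighted average of independent estimates."""
    weights = [1.0 / err ** 2 for _, err in results]
    mean = sum(w * m for w, (m, _) in zip(weights, results)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))

if __name__ == "__main__":
    with Pool(processes=8) as pool:  # 8 local cores; scale up to ~1000 on the cluster
        results = pool.map(one_run, [(seed, 100_000) for seed in range(8)])
    print(combine(results))
```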


Selected Results 🔗

Here I show two sets of results and describe, qualitatively, how they relate to the indirect search for new physics.

  • Indirect search requires precise predictions.
    • FIG. 4 presents the predictions for an invariant mass distribution. In particular, it includes the impact of NLO EW corrections (red and orange lines).
    • Such corrections change the shape of the distribution, which is signaled by the red line departing from the blue band in the lower panel.


FIG. 4: LO and NLO predictions for an invariant mass distribution.
  • Indirect search requires a model; a popular one is the Standard Model Effective Field Theory (SMEFT).
    • If any deviations are detected between the data and the predictions, the SMEFT can be used to model the deviations and provide hints about what new physics could look like.
    • To reduce false positive deviations (signals), precise predictions are key.
    • For instance, as illustrated in the lower panel of FIG. 5, the NLO EW corrections (difference between the blue and orange bands) are comparable in size to the LO SMEFT effects (difference between the zero dashed line and the forest-green or pink line).


FIG. 5: LO, NLO and LO SMEFT for an invariant mass distribution.

Summary 🔗

Perhaps it is a bit strange to give a summary of such a short, shallow and sketchy description of my PhD project. So let me confess that the Monte Carlo framework is far from perfect:

  • It works fine when the cluster is used, but the computational performance is not great, mainly because the framework is kept general (optimization is very likely possible once specific processes are targeted).
  • It is an automated framework in the sense that everything runs automatically for a fixed process. However, manual adjustments are still required if the process of interest is changed. Complete automation is doable; the difficulty is how to make the various tools interact and reorganize smoothly.
  • Is it possible to design a new algorithm or utilize machine learning techniques to replace the legacy VEGAS? Its importance sampling is not satisfactory for high-dimensional (e.g. the 16-dimensional integration for $(\sigma^{\text{R}}-\sigma^{\text{A}})$, see above) and non-diagonal (i.e. not aligned with the integration axes, e.g. $\sigma^{\text{V}}$ at NLO EW) cases; see the toy sketch below.
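
To show what “non-diagonal” means in practice, here is a toy comparison (an illustration only, with a made-up integrand) using the standalone `vegas` Python package: VEGAS adapts a separate grid along each axis, so a peak along the diagonal $x_1 \approx x_2$ is handled less efficiently than an axis-aligned peak of the same width.

```python
import math
import vegas  # standalone Python implementation of the VEGAS algorithm

def axis_aligned_peak(x):
    # Factorizes along the axes: VEGAS's per-axis grid adapts to it easily.
    return math.exp(-200.0 * (x[0] - 0.5) ** 2)

def diagonal_peak(x):
    # Peak along x1 = x2: cannot be factorized into g1(x1) * g2(x2).
    return math.exp(-200.0 * (x[0] - x[1]) ** 2)

for f in (axis_aligned_peak, diagonal_peak):
    integ = vegas.Integrator(2 * [[0.0, 1.0]])
    integ(f, nitn=10, neval=5000)            # adaptation iterations
    result = integ(f, nitn=10, neval=5000)   # final estimate
    # The relative uncertainty is typically larger for the diagonal peak.
    print(f.__name__, result.mean, result.sdev)
```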