Uncertainties in PMF

It helps to be precise about where uncertainty actually comes from in a PMF analysis, because the sources are distinct and require different approaches to characterize.

Measurement uncertainty is the most familiar. It reflects the analytical precision and accuracy of the input data. This is quantified during data preparation and fed directly into PMF as the uncertainty matrix. It is the most tractable source of uncertainty, and most studies handle it reasonably well.
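As a minimal sketch of how such a matrix might be built, the snippet below uses one widely used convention, the equation-based scheme described in the EPA PMF user guide: a fixed 5/6 of the detection limit for values at or below it, and otherwise the quadrature sum of a relative error term and half the detection limit. The function name, the detection limits, and the error fractions are illustrative assumptions, not SoFi defaults, and other error models may suit a given instrument better.

```python
import numpy as np

def build_uncertainty(conc, mdl, error_fraction):
    """Equation-based uncertainty matrix (samples x species).

    conc           : (n_samples, n_species) concentrations
    mdl            : (n_species,) method detection limits
    error_fraction : (n_species,) relative analytical error, e.g. 0.10
    """
    conc = np.asarray(conc, dtype=float)
    mdl = np.broadcast_to(np.asarray(mdl, dtype=float), conc.shape)
    frac = np.broadcast_to(np.asarray(error_fraction, dtype=float), conc.shape)

    # Above the detection limit: relative error and 0.5 * MDL added in quadrature.
    above = np.sqrt((frac * conc) ** 2 + (0.5 * mdl) ** 2)
    # At or below the detection limit: fixed fraction of the MDL.
    below = (5.0 / 6.0) * mdl
    return np.where(conc <= mdl, below, above)

# Hypothetical two-sample, two-species example.
conc = np.array([[0.02, 1.5],
                 [0.40, 0.1]])
sigma = build_uncertainty(conc, mdl=[0.05, 0.2], error_fraction=[0.10, 0.15])
```

The resulting sigma has the same shape as the data matrix, which is what PMF expects for the uncertainty input.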

Rotational ambiguity is less well understood in practice. PMF does not produce a unique solution: the objective function Q is minimized equally well by a family of solutions that are related to one another by rotational transformations. How rigorously this space is explored is one of the more consequential methodological choices in a PMF analysis.
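A small numerical illustration may help here. The sketch below builds a toy two-factor solution (all numbers are made up), applies an invertible matrix T to the contributions and its inverse to the profiles, and checks that Q is unchanged while both matrices stay non-negative. The names G, F, T, and q_value are chosen for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-factor solution: G (time x factor), F (factor x species).
G = np.array([[1.0, 0.5],
              [2.0, 0.1],
              [0.5, 3.0]])
F = np.array([[0.6, 0.3, 0.1],
              [0.1, 0.1, 0.8]])
sigma = np.full((3, 3), 0.05)              # assumed constant uncertainties
X = G @ F + rng.normal(0.0, 0.05, (3, 3))  # synthetic "measurements"

def q_value(X, G, F, sigma):
    """Sum of squared, uncertainty-scaled residuals."""
    return float(np.sum(((X - G @ F) / sigma) ** 2))

# T adds a fraction of factor 1 to factor 2 in G; inv(T) compensates in F.
T = np.array([[1.0, 0.1],
              [0.0, 1.0]])
G_rot = G @ T
F_rot = np.linalg.inv(T) @ F

q_original = q_value(X, G, F, sigma)
q_rotated = q_value(X, G_rot, F_rot, sigma)
```

Because G_rot @ F_rot equals G @ F, the two Q values agree to floating-point precision, yet the profiles and contributions differ. Non-negativity limits which T are admissible, but it does not, in general, reduce the family to a single solution.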

The traditional approach uses the fpeak parameter to navigate rotational space, but fpeak has a meaningful limitation: it applies a single scalar shift across all factors simultaneously. This makes it a blunt instrument. You cannot use it to selectively adjust the rotation of one factor without affecting all others, which makes targeted exploration of ambiguous source separations difficult to control.

A more powerful alternative is the random a-value approach, implemented in SoFi. Rather than applying a uniform rotational shift, random a-values introduce factor-specific perturbations, allowing the rotation of individual factors to vary independently. Running PMF across a large ensemble of random a-value combinations samples the rotational solution space far more thoroughly than fpeak exploration alone. The resulting spread in source profiles and contributions gives a much clearer picture of where the solution is well-constrained and where genuine rotational ambiguity remains.
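The idea can be sketched as follows, assuming the common a-value convention in which each element of a constrained profile may deviate from a reference (anchor) profile by at most a relative fraction a. The function below only draws one random a-value per factor per run and derives the corresponding bounds; it is not SoFi's interface, the names are hypothetical, and the constrained PMF solve that would consume these bounds is not shown.

```python
import numpy as np

def random_a_value_bounds(anchors, a_max, n_runs, seed=0):
    """Draw one a-value per factor per run and return the profile bounds.

    anchors : (n_factors, n_species) reference profiles (assumed given)
    a_max   : (n_factors,) upper limit of the a-value for each factor
    Returns a_values (n_runs, n_factors) and lower/upper bounds
    of shape (n_runs, n_factors, n_species).
    """
    rng = np.random.default_rng(seed)
    anchors = np.asarray(anchors, dtype=float)
    a_max = np.asarray(a_max, dtype=float)

    # Independent draw for every factor in every run.
    a_values = rng.uniform(0.0, a_max, size=(n_runs, anchors.shape[0]))

    # Allowed window around each anchor element: f * (1 - a) .. f * (1 + a).
    lower = anchors[None] * (1.0 - a_values[:, :, None])
    upper = anchors[None] * (1.0 + a_values[:, :, None])
    return a_values, lower, upper

# Hypothetical anchors: factor 1 tightly constrained, factor 2 loosely.
anchors = np.array([[0.5, 0.3, 0.2],
                    [0.1, 0.2, 0.7]])
a_values, lower, upper = random_a_value_bounds(anchors, a_max=[0.1, 0.5], n_runs=100)
```

Each row of a_values would configure one PMF run; the spread of the accepted solutions across runs is what characterizes the rotational uncertainty.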

Model error is the hardest to address. It reflects the extent to which the bilinear PMF model is an appropriate representation of your source mixture. In most ambient air quality applications the model is a reasonable approximation, but it remains an approximation, and systematic patterns in the residuals can indicate where it breaks down. Residual analysis is an underused diagnostic: species with large or structured residuals often point to missing sources, poorly resolved factors, or non-linear mixing that the model cannot capture.
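A simple residual check could look like the sketch below. It scales the residuals by the input uncertainties and reports, per species, the mean, the fraction of scaled residuals beyond ±3, and the lag-1 autocorrelation along time as a crude indicator of structure. The threshold of 3, the choice of lag-1 autocorrelation, and all names are illustrative assumptions rather than fixed rules.

```python
import numpy as np

def residual_summary(X, G, F, sigma):
    """Per-species diagnostics of uncertainty-scaled residuals."""
    r = (X - G @ F) / sigma                       # (n_samples, n_species)
    frac_large = np.mean(np.abs(r) > 3.0, axis=0)
    mean_r = r.mean(axis=0)

    # Lag-1 autocorrelation along time; values far from 0 suggest structure.
    r0 = r[:-1] - r[:-1].mean(axis=0)
    r1 = r[1:] - r[1:].mean(axis=0)
    denom = np.sqrt((r0 ** 2).sum(axis=0) * (r1 ** 2).sum(axis=0))
    lag1 = np.divide((r0 * r1).sum(axis=0), denom,
                     out=np.zeros_like(denom), where=denom > 0)
    return {"mean": mean_r, "frac_abs_gt_3": frac_large, "lag1_autocorr": lag1}

# Synthetic example: species 3 carries a slow trend the model does not explain.
rng = np.random.default_rng(1)
n = 200
G = rng.uniform(0.5, 2.0, (n, 2))
F = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])
sigma = np.full((n, 3), 0.05)
X = G @ F + rng.normal(0.0, 0.05, (n, 3))
X[:, 2] += np.linspace(-0.15, 0.15, n)            # unmodelled drift

diag = residual_summary(X, G, F, sigma)
```

In this synthetic case the drifting species shows a lag-1 autocorrelation well above that of the other two, which is the kind of structured residual the paragraph above refers to.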

Strategies for capturing the uncertainties

Given that uncertainty in PMF results stems from three distinct sources, addressing it properly requires a corresponding set of approaches; no single method covers all three.

Bootstrap resampling is the most widely used uncertainty estimation approach, and addresses sensitivity to sampling variability. By repeatedly sampling from your data with replacement and re-running the model, it characterizes how stable your source profiles and contributions are across the natural variability in the dataset. Bootstrap uncertainty, however, should not be interpreted as total solution uncertainty. The bootstrap procedure evaluates sensitivity to the dataset itself. It does not systematically explore the rotational solution space surrounding each solution, nor the model error.
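The procedure can be sketched in a few lines. To keep the example self-contained, a simple weighted multiplicative-update NMF stands in for the PMF solver; it is not the solver used by SoFi and is only adequate for toy data. Rows are resampled with replacement, the model is refitted, and each bootstrap profile is matched to the base-run profile it correlates with best. All function names and the synthetic data are assumptions made for this illustration.

```python
import numpy as np

def fit_weighted_nmf(X, sigma, n_factors, n_iter=500, seed=0):
    """Stand-in solver: weighted multiplicative updates that reduce Q."""
    rng = np.random.default_rng(seed)
    W = 1.0 / sigma ** 2
    G = rng.uniform(0.1, 1.0, (X.shape[0], n_factors))
    F = rng.uniform(0.1, 1.0, (n_factors, X.shape[1]))
    eps = 1e-12
    for _ in range(n_iter):
        G *= ((W * X) @ F.T) / ((W * (G @ F)) @ F.T + eps)
        F *= (G.T @ (W * X)) / (G.T @ (W * (G @ F)) + eps)
    return G, F

def match_to_base(F_boot, F_base):
    """Greedy one-to-one matching of bootstrap to base profiles by correlation."""
    k = F_base.shape[0]
    corr = np.corrcoef(F_base, F_boot)[:k, k:]    # rows: base, columns: bootstrap
    order = np.full(k, -1)
    for _ in range(k):
        i, j = np.unravel_index(np.argmax(corr), corr.shape)
        order[i] = j
        corr[i, :] = -np.inf
        corr[:, j] = -np.inf
    return order

def bootstrap_profiles(X, sigma, n_factors, n_boot=20, seed=0):
    """Return base profiles and matched bootstrap profiles (rows sum to 1)."""
    rng = np.random.default_rng(seed)
    _, F_base = fit_weighted_nmf(X, sigma, n_factors)
    F_base = F_base / F_base.sum(axis=1, keepdims=True)
    boots = []
    for b in range(n_boot):
        idx = rng.integers(0, X.shape[0], size=X.shape[0])  # rows, with replacement
        _, F_b = fit_weighted_nmf(X[idx], sigma[idx], n_factors, seed=b + 1)
        F_b = F_b / F_b.sum(axis=1, keepdims=True)
        boots.append(F_b[match_to_base(F_b, F_base)])
    return F_base, np.array(boots)

# Synthetic two-source data set; multiplicative updates need non-negative input.
rng = np.random.default_rng(42)
G_true = rng.uniform(0.0, 5.0, (120, 2))
F_true = np.array([[0.60, 0.30, 0.05, 0.05],
                   [0.05, 0.10, 0.35, 0.50]])
sigma = np.full((120, 4), 0.05)
X = np.clip(G_true @ F_true + rng.normal(0.0, 0.05, (120, 4)), 1e-6, None)

F_base, F_boot = bootstrap_profiles(X, sigma, n_factors=2, n_boot=20)
spread = F_boot.std(axis=0)   # per-factor, per-species bootstrap spread
```

The spread array summarizes how much each profile element varies across resampled data sets. As noted above, this reflects sensitivity to the data only, not rotational ambiguity or model error.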

Rotational ambiguity requires a separate strategy. The random a-value approach addresses this by perturbing the elements of the factor profiles and contributions within allowable rotational constraints and repeatedly re-solving the PMF system across a large ensemble of randomized configurations. In practice, this samples the accessible rotational space far more thoroughly than directed fpeak sweeps, which are often difficult to control and may only explore a limited subset of possible rotations.

Used together, bootstrap resampling and random a-value exploration provide a substantially more complete characterization of PMF solution stability than either method alone.

A third component of a rigorous uncertainty strategy enters earlier in the workflow, during data preparation. The uncertainty matrix supplied to PMF is not merely descriptive metadata; it actively determines the weighting structure of the model fit. Its construction therefore has direct consequences for factor resolution and interpretability. SoFi supports a configurable set of tools for uncertainty matrix refinement, including error estimation procedures, species down-weighting, sample and variable blacklisting, and LOD-based exclusion strategies. None of these operations are universally required, but all are available where justified by the characteristics of the dataset. The critical point is that these choices should be made deliberately and transparently, rather than inherited as unexamined defaults.
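To make these operations concrete, the sketch below applies three of them to a small input: multiplying the uncertainties of selected species by a down-weighting factor, removing blacklisted species, and excluding species whose values fall below the LOD too often. The function, its parameters, the 50 % threshold, and the factor of 3 are hypothetical choices for illustration and do not reproduce SoFi's interface or defaults.

```python
import numpy as np

def refine_inputs(X, sigma, species, lod, downweight=None,
                  blacklist=(), max_below_lod=0.5):
    """Down-weight, blacklist, and LOD-filter species before a PMF run.

    downweight    : dict mapping species name to an uncertainty multiplier
    blacklist     : species names to remove outright
    max_below_lod : maximum tolerated fraction of values below the LOD
    """
    X = np.asarray(X, dtype=float)
    sigma = np.asarray(sigma, dtype=float).copy()   # leave the input untouched
    lod = np.asarray(lod, dtype=float)

    # Down-weighting: inflate the uncertainty so the species pulls less on the fit.
    for name, factor in (downweight or {}).items():
        sigma[:, species.index(name)] *= factor

    # LOD-based exclusion and blacklisting are combined into one column mask.
    frac_below = np.mean(X < lod, axis=0)
    keep = np.array([name not in blacklist for name in species])
    keep &= frac_below <= max_below_lod

    kept_species = [s for s, k in zip(species, keep) if k]
    return X[:, keep], sigma[:, keep], kept_species

# Hypothetical input: K is below its LOD in three of four samples.
species = ["OC", "EC", "K"]
X = np.array([[2.0, 0.5, 0.1],
              [1.5, 0.4, 0.2],
              [3.0, 0.7, 0.6],
              [2.5, 0.6, 0.3]])
sigma = np.full_like(X, 0.1)
X_ref, sigma_ref, kept = refine_inputs(X, sigma, species, lod=[0.1, 0.1, 0.5],
                                       downweight={"EC": 3.0})
```

Keeping every such choice in one explicit, parameterized step makes it easy to report exactly what was done and to rerun the analysis with different settings.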

Model error remains the most difficult uncertainty component to quantify directly. Residual diagnostics provide at least a structured means of identifying where the model representation begins to fail. Systematic residual patterns often indicate unresolved sources, poorly separated factors, temporal variability in source profiles, or non-linear processes that violate the assumptions of the bilinear model itself. A complete quantification of model error is rarely tractable in practice, but recognizing its existence remains an essential part of rigorous PMF interpretation.
