Why SoFi is the best source apportionment software

While EPA PMF and PET exist, SoFi is by far the most advanced PMF software, with a myriad of features to support the pre-analysis, the PMF calculations and the post-analysis. We offer not only fast, user-friendly software but also support during the analysis. Here are some highlights of why SoFi is the best software for source apportionment analysis.

Data import

Supported input files
  • SoFi: PMF input as .itx or .xls(x); external time series/profiles as .itx, .xls(x), .csv, .dat or .txt
  • EPA PMF: PMF input as .xls(x)

Error matrix
  • SoFi: import as a matrix; if not available, automatic error matrix calculation according to Polissar et al. (1998) or Norris et al. (2014)
  • EPA PMF: import as a matrix; if not available, derived from a list of variables, detection limit and percent uncertainty (Norris et al., 2014)

Averaging to the desired time stamp
  • SoFi: data and error matrix can be averaged with one click into any lower time resolution
  • EPA PMF: has to be done separately, no support in EPA PMF
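To make the error matrix and averaging steps more concrete, here is a minimal Python sketch. The first function follows the equation-based uncertainty as we understand it from the EPA PMF 5.0 user guide (Norris et al., 2014), built from a detection limit and a percent uncertainty. The second averages consecutive points into a lower time resolution and propagates the errors assuming independent uncertainties, which is an assumption of this sketch. The function names are hypothetical, and this is not the code used by SoFi or EPA PMF.

```python
import math


def equation_based_uncertainty(conc, mdl, error_fraction):
    """Uncertainty of one cell from its detection limit (mdl) and its
    fractional error (percent uncertainty / 100)."""
    if conc <= mdl:
        # At or below the detection limit: fixed fraction of the MDL.
        return 5.0 / 6.0 * mdl
    # Above the detection limit: error fraction and MDL added in quadrature.
    return math.sqrt((error_fraction * conc) ** 2 + (0.5 * mdl) ** 2)


def block_average(values, errors, n):
    """Average every n consecutive points of one variable.

    Errors are propagated as sqrt(sum(e^2)) / n, i.e. assuming independent
    uncertainties (an assumption of this sketch). An incomplete block at
    the end of the series is dropped.
    """
    averaged, averaged_err = [], []
    for start in range(0, len(values) - n + 1, n):
        block = values[start:start + n]
        block_err = errors[start:start + n]
        averaged.append(sum(block) / n)
        averaged_err.append(math.sqrt(sum(e * e for e in block_err)) / n)
    return averaged, averaged_err
```

For example, `block_average([0, 2, 4, 6, 9], [1, 1, 1, 1, 1], 2)` averages the first four points pairwise and drops the fifth, incomplete block.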

In addition, there are a number of instrument-specific settings. For example, SoFi can automatically exclude CO2-related variables for Aerodyne instruments. SoFi also offers options for treating AE33 data files.


SoFi offers a variety of options to inspect the data before running PMF. All graphs are fully customizable and come in publication quality. With SoFi, you can also compare your data with external data. For instance, you can plot variables (absolute or fractions) against external data in scatter plots, or inspect the S/N of the whole matrix or of selected ions, both for each time point and on average (EPA PMF only gives the average S/N for each variable). Furthermore, the user can define classes over time and profiles.

The external data does not need to have the same time resolution or the same variables as the PMF input. SoFi automatically extracts, averages or interpolates the external data (time series and profiles) to fit the time resolution, time points and variables of the PMF input. No additional software is needed.
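As an illustration of what fitting external data to the PMF time points involves, the following Python sketch linearly interpolates an external time series onto a target time grid. It is a simplified stand-in, not SoFi's routine: the function name is hypothetical, times are assumed to be plain numbers sorted in ascending order, and target points outside the external time range are returned as `None` rather than extrapolated.

```python
from bisect import bisect_left


def interpolate_external(target_times, ext_times, ext_values):
    """Linearly interpolate external values onto target_times.

    ext_times must be sorted ascending. Target points outside the external
    time range get None (no extrapolation in this sketch).
    """
    result = []
    for t in target_times:
        if t < ext_times[0] or t > ext_times[-1]:
            result.append(None)
            continue
        i = bisect_left(ext_times, t)  # first index with ext_times[i] >= t
        if ext_times[i] == t:
            # Exact match: take the external value as it is.
            result.append(ext_values[i])
            continue
        t0, t1 = ext_times[i - 1], ext_times[i]
        v0, v1 = ext_values[i - 1], ext_values[i]
        result.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return result
```

Averaging external data that has a higher time resolution than the PMF input would be a separate step and is not covered here.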

PMF calculations/preparations

Currently, SoFi and EPA PMF both use the ME-2 solver to perform the PMF calculations. However, in SoFi the user has more control over the settings. A short list of different set-ups for running PMF:

Matrix preparation for the ME-2 solver
  • SoFi: empty rows and columns can be easily removed, single cells replaced with the missing value indicator, as required for the ME-2 solver
  • EPA PMF: missing values can be replaced with the missing value indicator (exclude species) or replaced by the median

Blacklist (exclude) data
  • SoFi: users can easily define classes over time series or variables. Options for defining blacklisted points: single time points/variables, indices-based for single or bulk, bulk selection via marquee, based on an external list, based on certain months, days (single days but also weekend/weekdays), hours and minutes
  • EPA PMF: blacklist by setting variables as “bad variables” or exclude time points

Downweighting of variables
  • SoFi:
    – for the whole data matrix: step-wise (customizable values for weak and bad variables) or as a function of 1/(S/N), either based on the average S/N or cell-wise (single time point) S/N
    – single variables can be downweighted with customizable values
  • EPA PMF: possible, but manually for each variable and at one level for all variables (downweighting by 3)

Number of factors
  • SoFi: running (repeatedly) PMF over a range of factors
  • EPA PMF: running (repeatedly) PMF over just one number of factors

Seed setting
  • SoFi: random (random for each PMF run), pseudo-random (random for the first PMF run, all following runs then use the same seed) and based on a reference
  • EPA PMF: random, but only for the first run; the others then use the same value (“pseudo-random” in SoFi)

Constraints
  • SoFi: time series or profiles can be constrained with external data. Random and sensitivity-controlled constraints are easy to set up:
    – a-value technique (random, exact, sensitivity analysis, ME-2 limits)
    – pulling equations
    – fpeak
  • EPA PMF: pre-defined set of constraints. Low control and highly time-consuming when many constraints are applied

Bootstrapping
  • SoFi: unblocked, block-wise and user-controlled (monthly, daily, hourly, user-defined groups)
  • EPA PMF: block-wise

Rolling PMF/moving window approach
  • SoFi: fully automated, easy to combine with the random a-value technique and bootstrapping
  • EPA PMF: windows have to be prepared manually

C-value approach (relative error scaling)
  • SoFi: user-specific C-value easy to set, specific residual analysis to compare the residuals
  • EPA PMF: not available

Multi-time PMF
  • SoFi: automatically prepared PMF input matrices
  • EPA PMF: not available

Control over the convergence criteria
  • SoFi: yes
  • EPA PMF: no

PMF run continuation after a computer crash
  • SoFi: continue after the last fully performed PMF run before the computer shut down
  • EPA PMF: start from the beginning

Saving the settings, configuration file, sharing results
  • SoFi: the user can save SoFi at any point in the analysis. All graphs, settings and PMF results are saved. PMF runs are saved in HDF5 files and can be easily shared. Results can be opened from any experiment on any computer and all data and settings are available
  • EPA PMF: the configuration file only saves the path to the data and the input data/downweighting, but not any results or graphs. If closed, runs have to be repeated
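Two of the set-ups listed above can be sketched in a few lines of Python: step-wise downweighting and the a-value bounds for a constrained profile. This is an illustrative sketch under stated assumptions, not SoFi's or EPA PMF's code. The S/N limits and downweighting factors are example defaults chosen for illustration, and all function names are hypothetical. For the a-value, the sketch assumes that a reference profile value f may vary within f ± a·f.

```python
def stepwise_factors(snr, weak_limit=2.0, bad_limit=0.2,
                     weak_factor=2.0, bad_factor=10.0):
    """One downweighting factor per variable from its S/N.

    Limits and factors are illustrative defaults (assumptions of this
    sketch) and are meant to be customized.
    """
    factors = []
    for value in snr:
        if value < bad_limit:
            factors.append(bad_factor)    # "bad" variable
        elif value < weak_limit:
            factors.append(weak_factor)   # "weak" variable
        else:
            factors.append(1.0)           # left untouched
    return factors


def downweight(error_matrix, factors):
    """Multiply each column of the error matrix by its factor."""
    return [[e * f for e, f in zip(row, factors)] for row in error_matrix]


def a_value_bounds(reference_profile, a):
    """Nominal lower and upper limits f*(1 - a) and f*(1 + a) for each
    entry f of a constrained reference profile."""
    lower = [(1.0 - a) * f for f in reference_profile]
    upper = [(1.0 + a) * f for f in reference_profile]
    return lower, upper
```

With `a = 0` the profile is fixed to the reference, while larger a-values give the solver more freedom around it.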

While EPA PMF is designed for relatively small datasets, SoFi can also deal with larger, mega-sized matrices. Furthermore, with the rolling mechanism there is no size limitation, as each rolling window is by definition only a fraction of the total dataset. Moreover, SoFi will be the only software to work with the new R-PMF solver. R-PMF is a new PMF solver, developed by Dr. Pentti Paatero, designed for giga-sized matrices.
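The idea behind the rolling windows can be sketched as follows: the full dataset is cut into overlapping index ranges, and PMF is run on each range separately, so no single run ever sees the whole matrix. The Python function below is a hypothetical helper for illustration only. It works on row indices, whereas a real rolling analysis would typically define the window length and shift in units of time.

```python
def rolling_windows(n_points, window, step):
    """Return (start, stop) row-index pairs with stop exclusive.

    Windows have `window` rows and are shifted by `step` rows. A final
    window is added if needed so that the end of the data is covered.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if n_points <= window:
        # The dataset is smaller than one window: use it as a whole.
        return [(0, n_points)]
    starts = list(range(0, n_points - window + 1, step))
    if starts[-1] + window < n_points:
        starts.append(n_points - window)
    return [(s, s + window) for s in starts]
```

For example, 10 time points with a window of 4 and a step of 3 give the windows (0, 4), (3, 7) and (6, 10).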


SoFi offers a huge variety of post-analysis features. Besides a wide range of graphical options, SoFi lets you look at many PMF solutions and their average. EPA PMF only shows one PMF solution at a time and has no option to average solutions. With SoFi, you can average anywhere from a few to several thousand PMF solutions, and you keep full control over which solutions are included by defining criteria for environmentally reasonable solutions.

Criteria panel

When running bootstrapped PMF or rolling PMF (especially in combination with the random a-value approach and the bootstrapping technique), the user ends up with thousands of PMF runs. Not all of these runs will be environmentally reasonable solutions, and manually inspecting that many runs is no longer feasible. This is why we developed the criteria panel. With user-defined criteria, good solutions can be extracted in a short period of time. A selection of available criteria:

  • Average: intensity of single variables, concentration of the solution or of variables, cycles, explained variation and many more. Mathematical expressions and combinations are also possible, as are correlations between the solution and externals, variables, etc., or multilinear regressions.
  • Image: image plots for a better understanding of, e.g., diurnal patterns over the solutions
  • Movie: scatter plots that can be scrolled through (like a movie)
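To give a feeling for how a correlation-based criterion works, here is a minimal Python sketch. It keeps only those PMF runs in which the time series of a factor correlates with an external tracer above a user-defined threshold, and then averages the accepted runs. The function names and the threshold of 0.6 are hypothetical choices for illustration, and the sketch is not SoFi's implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences.

    Both sequences must vary (constant input would divide by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_runs(factor_series_per_run, external, min_r=0.6):
    """Indices of the runs whose factor time series reaches min_r
    against the external tracer."""
    return [i for i, series in enumerate(factor_series_per_run)
            if pearson(series, external) >= min_r]


def average_runs(factor_series_per_run, accepted):
    """Point-wise mean of the factor time series over the accepted runs."""
    if not accepted:
        return []
    n_time = len(factor_series_per_run[accepted[0]])
    return [sum(factor_series_per_run[i][t] for i in accepted) / len(accepted)
            for t in range(n_time)]
```

In practice several criteria would be combined, and the spread over the accepted runs would be inspected alongside the mean.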

Graphical options

EPA PMF only offers basic, non-customizable graphs. SoFi, on the other hand, offers a huge number of graphs, all fully customizable and in publication quality. Some features that are exclusively available in SoFi:

  • “Normal” plots: Profiles, time series (absolute and fraction), cycle plots (diurnal to yearly), scatter plots, pie charts, bar plots, correlation matrices. A huge selection of possible parameters to plot: solution, variable, variable of solution, explained variation, residuals, external data, multilinear regression, model subtraction (ACSM only), HR family plot (HR-AMS), … in short, any kind of plot.
  • HR-analysis: elemental ratio, mass defect, van Krevelen and carbon number distribution plots. Easily create user-defined families to better understand your data.
  • Averages: SoFi calculates daily, weekly, monthly and yearly averages for a single PMF solution. Furthermore, several PMF solutions (from a few to thousands) can be averaged together for a better understanding of the “real” PMF solution (see also above under criteria panel).
  • Plot based on classes/time: Make use of user-defined classes over time or profiles or select certain hours/days/months/years to inspect parts of the data to understand certain trends better.
  • (Un)explained variation: In SoFi, the user also has access to the unexplained variation. It is subdivided into real (from the measurements) and noisy (from the PMF) unexplained variation.
  • Residual analysis: In SoFi, the user has access to the residual, absolute residual, scaled residual, absolute scaled residual and Q for each time point and variable, respectively. Furthermore, the residuals can be viewed as a histogram, or one can scroll through the full residual matrix.
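For readers less familiar with these residual quantities, the following Python sketch shows how they relate to each other, using the usual PMF notation: the data matrix X is approximated by the product of the factor time series G and the factor profiles F, and each residual is scaled by its uncertainty from the error matrix. The function name and the returned dictionary keys are hypothetical, and matrices are plain nested lists to keep the sketch self-contained.

```python
def residual_diagnostics(X, G, F, sigma):
    """Residuals and Q for X ~ G * F with uncertainties sigma.

    X, sigma: time points x variables; G: time points x factors;
    F: factors x variables (all as nested lists).
    """
    n_time, n_var, n_fac = len(X), len(X[0]), len(F)
    residual = [[X[i][j] - sum(G[i][k] * F[k][j] for k in range(n_fac))
                 for j in range(n_var)] for i in range(n_time)]
    scaled = [[residual[i][j] / sigma[i][j] for j in range(n_var)]
              for i in range(n_time)]
    q_cells = [[s * s for s in row] for row in scaled]
    return {
        "residual": residual,
        "abs_residual": [[abs(r) for r in row] for row in residual],
        "scaled_residual": scaled,
        "abs_scaled_residual": [[abs(s) for s in row] for row in scaled],
        # Q contributions summed per time point, per variable and in total.
        "q_per_time": [sum(row) for row in q_cells],
        "q_per_variable": [sum(q_cells[i][j] for i in range(n_time))
                           for j in range(n_var)],
        "q_total": sum(sum(row) for row in q_cells),
    }
```

A histogram of the scaled residuals can then be drawn from the `scaled_residual` entries with any plotting tool.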
Advanced additional analysis
  • Wind analysis: SoFi also offers wind and pollution roses as well as probability and NWR polar plots. Furthermore, SoFi is optimized to export data to ZeFir for backtrajectory analysis.
  • Marquee panel: It has never been so easy to compare data. The marquee panel lets you select data (based on defined limits, marquee or classes) and applies that selection on other open graphs. E.g., you see an interesting trend in a scatter plot, select that data and get more statistics on just that period, cross-check how other variables behave during that time, get an updated pie chart or scatter plot just for that period. Possibilities are endless.
  • Bootstrap panel: Ever wondered how stable your correlations are? Bootstrap your data and get an idea on the mean and spread values.
  • Clustering panel: SoFi lets you cluster solutions and variables (also before PMF) to better understand the data. Silhouette plots give an idea of the quality of the clustering. This can also be used to select environmentally reasonable solutions.
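The idea behind bootstrapping a correlation, as mentioned for the bootstrap panel, can be sketched in Python as follows. The data pairs are resampled with replacement many times, the correlation is recomputed for each resample, and the mean and spread of the resulting values are reported. This is a plain (unblocked) bootstrap with hypothetical function names and default settings, not the routine used in SoFi.

```python
import math
import random
import statistics


def bootstrap_correlation(x, y, n_boot=1000, seed=0):
    """Mean and standard deviation of the Pearson correlation over
    n_boot resamples (with replacement) of the pairs (x[i], y[i])."""
    rng = random.Random(seed)  # fixed seed for reproducible results
    n = len(x)
    correlations = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        sxx = sum((a - mean_x) ** 2 for a in xs)
        syy = sum((b - mean_y) ** 2 for b in ys)
        if sxx == 0 or syy == 0:
            continue  # degenerate resample without variation, skipped
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
        correlations.append(sxy / math.sqrt(sxx * syy))
    if not correlations:
        raise ValueError("no valid bootstrap resample")
    return statistics.mean(correlations), statistics.pstdev(correlations)
```

A narrow spread indicates that the correlation does not depend on a few individual points, while a wide spread suggests that it does.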


We at Datalystica provide full technical and reasonable scientific support for SoFi, and we are happy to help with software and PMF questions. EPA is not able to provide user support or troubleshooting support for EPA PMF.


We provide regular updates for SoFi Pro/RT (our two software packages). The updates not only include bug fixes but also regularly contain new features to facilitate the PMF analysis. SoFi is always up to date and therefore compatible with the newest Igor and Windows versions. EPA PMF, on the other hand, was tested for Windows versions 7 to 10 and is currently not updated for newer operating systems.


EPA PMF is free to use, while SoFi is based on a yearly license system. Nevertheless, SoFi facilitates and enormously speeds up the analysis through its extremely easy handling, generates robust results and allows for a thorough investigation of the data. And if you are stuck at some point, we are always here for you with our rapid support!

So in summary, SoFi is the best PMF software that is currently available.

Note: all information about EPA PMF is from their website, their manual or publications as of October 2022.


Check out our knowledge base for more information on SoFi.