Processing eROSITA data

This tutorial will teach you how to use DAXA to process raw eROSITA data into a science-ready state using one line of Python code (or several lines, if you wish to have more control over the settings for each step). This relies on there being an initialised backend installation of the eROSITA Science Analysis Software System (eSASS), including accessible calibration files; the initialisation can be done either manually before launching Python, or in your bash profile/rc. DAXA will check for such an installation, and will not allow processing to start without it.

All DAXA processing steps will parallelise as much as possible - processes running on different ObsIDs/instruments/sub-exposures will be run simultaneously (if cores are available).
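
As a rough illustration of that idea (and not DAXA's actual implementation), independent work units can be handed to a pool of workers whose size is capped by the number of cores allowed. The task labels and the `process_one` and `run_all` helpers below are hypothetical stand-ins invented for this sketch:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical, made-up work units: (ObsID, instrument) pairs that do not
# depend on one another, so they can be processed at the same time.
tasks = [("obs-A", "TM1"), ("obs-A", "TM2"), ("obs-B", "TM1")]


def process_one(task):
    """Stand-in for launching one backend command for one ObsID/instrument."""
    obs_id, inst = task
    return f"{obs_id}-{inst}: done"


def run_all(task_list, num_cores):
    """Run every task, with at most `num_cores` running simultaneously."""
    with ThreadPoolExecutor(max_workers=num_cores) as pool:
        # map() returns results in the same order as the input tasks
        return list(pool.map(process_one, task_list))


results = run_all(tasks, num_cores=2)
```

With more tasks than workers, the remaining tasks simply wait until a worker becomes free, which is why the total run time depends on both the amount of data and the number of cores.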

Import Statements

[1]:
from daxa.mission import eRASS1DE, eROSITACalPV
from daxa.archive import Archive
from daxa.process.simple import full_process_erosita
from daxa.process.erosita.clean import flaregti
from daxa.process.erosita.assemble import cleaned_evt_lists

An Archive to be processed

Every processing function implemented in DAXA takes an Archive instance as its first argument; if you don’t already know what that is then you should go back and read the following tutorials:

  • Creating a DAXA archive - This explains how to create an archive, load an existing archive, and the various properties and features of DAXA archives.

  • Using DAXA missions - Here we explain what DAXA mission classes are and how to use them to select only the data you need.

Here we create an archive of eRASS DR1 and eFEDS observations of the eFEDS cluster with identifier 339:

[2]:
ec = eROSITACalPV()
ec.filter_on_positions([[133.071, -1.025]])
er = eRASS1DE()
er.filter_on_positions([[133.071, -1.025]])

arch = Archive('eFEDSXCS-339', [ec, er], clobber=True)
/mnt/pact/dt237/code/PycharmProjects/DAXA/daxa/mission/base.py:1075: UserWarning: A field-of-view cannot be easily defined for eROSITACalPV and this number is the approximate half-length of an eFEDS section, the worst case separation - this is unnecessarily large for pointed observations, and you should make your own judgement on a search distance.
  fov = self.fov
Downloading eROSITACalPV data: 100%|████████████████████████████████████████████| 1/1 [00:24<00:00, 24.33s/it]
Downloading eRASS DE:1 data: 100%|██████████████████████████████████████████████| 2/2 [00:01<00:00,  1.10it/s]

One-line solution

Though we provide individual functions that wrap the various steps required to reduce and prepare eROSITA data, and they can be used separately for greater control over the configuration parameters, we also include a one-line solution which executes the processing steps with default configuration.

We believe that the default parameters are adequate for most use cases, and this allows for users unfamiliar with the intricacies of eROSITA data to easily start working with it. Executing the following will automatically generate cleaned combined event lists for all telescope modules selected upon the original declaration of the mission.

fm00_300007_020_EventList_c001.fits

[3]:
full_process_erosita(arch)
eROSITACalPV - Finding flares in observations: 100%|████████████████████████████| 2/2 [00:11<00:00,  5.91s/it]
eRASS DE:1 - Finding flares in observations: 100%|██████████████████████████████| 2/2 [00:02<00:00,  1.20s/it]
eROSITACalPV - Generating final event lists: 100%|██████████████████████████████| 2/2 [01:01<00:00, 30.75s/it]
eRASS DE:1 - Generating final event lists: 100%|████████████████████████████████| 2/2 [00:02<00:00,  1.07s/it]

General arguments for all processing functions

There are some arguments that all processing functions will take - they don’t control the behaviour of the processing step itself, but instead how the commands are executed:

  • num_cores - The number of cores that the function is allowed to use. This is set globally by the daxa.NUM_CORES constant (if set before any other DAXA function is imported), and by default is 90% of the cores available on the system.

  • timeout - This controls how long an individual instance of the process is allowed to run before it is cancelled; the whole function may run for longer depending on how many pieces of data need processing and how many cores are allocated. The default is generally None (no timeout), but it can be set using an astropy quantity with time units.
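
The two arguments above can be pictured with a small standard-library sketch. It assumes that '90% of the cores' is rounded down (an assumption, though it would be consistent with the num_cores default of 115 shown in the help output later in this tutorial if that machine had 128 cores), and it uses hypothetical helper names, `default_num_cores` and `run_with_timeout`, that are not part of DAXA:

```python
import math
import subprocess
import sys


def default_num_cores(total_cores, fraction=0.9):
    """Assumed rule: take 90% of the cores, rounded down, but never below 1."""
    return max(1, math.floor(total_cores * fraction))


def run_with_timeout(cmd, timeout_s):
    """Run one command; return False if it had to be cancelled for taking too long.

    The timeout applies to this single command only, not to a whole batch.
    """
    try:
        subprocess.run(cmd, check=True, timeout=timeout_s, capture_output=True)
        return True
    except subprocess.TimeoutExpired:
        return False


cores = default_num_cores(128)
# A deliberately slow command (sleeps for 5 s) with a 0.5 s limit is cancelled
finished = run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 0.5)
```

Because the limit is applied per command, a batch of many observations can still take far longer than the timeout value in total.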

Breaking down the eROSITA processing steps

There are fewer individual steps for eROSITA compared to a telescope like XMM - this reflects its simpler design, with only a single camera type, as well as the differences in how data are served to the community and the backend software design. This section describes the different processing steps that DAXA can apply to eROSITA data (both all-sky and calibration).

Identifying periods of high flaring (flaregti)

The flaregti function searches the event lists for periods where there is high soft-proton flaring - any periods without such flaring are defined as good-time-intervals (GTIs) and will be used to clean the event lists later. This DAXA function acts as an interface to the eSASS tool of the same name, which determines periods of flaring when the generated light curve exceeds a set threshold - it also attempts to create a mask to remove sources prior to the final generation and assessment of the light curves.

The following arguments can be passed:

  • pimin - Controls the lower energy bound for creating lightcurves used to determine badly flared times. Default is 0.2 keV.

  • pimax - Controls the upper energy bound for creating lightcurves used to determine badly flared times. Default is 10.0 keV.

  • mask_pimin - Controls the lower energy bound of the data on which source detection is performed - the detected sources are masked to account for their variability when determining flared time periods. Default is 0.2 keV.

  • mask_pimax - Controls the upper energy bound of the data on which source detection is performed - the detected sources are masked to account for their variability when determining flared time periods. Default is 10.0 keV.

  • binsize - The X and Y binning size for the image on which source detection is performed to create a mask (in eROSITA sky pixels).

  • detml - The detection likelihood threshold for sources that will be included in the mask creation.

  • timebin - The time binning applied to the lightcurve prior to the count-rate threshold checks.

  • source_size - The size of source for which a source likelihood is computed when creating source lists to generate a mask.

  • source_like - The likelihood used to ‘detect’ a source which is then used to minimise the detected source rate to decide on the threshold for flaring events.

  • threshold - The count-rate threshold above which the light curve is considered flared. If a positive value is set it acts as an absolute threshold for the entire observation under consideration, whereas if a negative threshold is set here the threshold is computed dynamically on a spatial grid. We set the default value to be negative.

  • max_threshold - If positive, this limits the threshold values that are dynamically computed (if threshold is negative) so that they can only be less than max_threshold. By default, the value of this argument is negative, in which case no maximum is applied.

  • mask_iter - The number of iterations of masking, flare determination, and redetection used in the creation of the final good-time intervals. The default is 3.
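
To make the roles of timebin and a positive (absolute) threshold more concrete, here is a minimal sketch of turning a binned light curve into good-time intervals. It is a simplified illustration only, not the eSASS flaregti algorithm (it ignores source masking, the spatial grid, and dynamic thresholds), and the `good_time_intervals` helper and the count rates are made up for the example:

```python
def good_time_intervals(rates, t_start, timebin, threshold):
    """Return [(start, stop), ...] covering contiguous bins with rate <= threshold.

    rates     - count rate in each time bin (arbitrary units for this sketch)
    t_start   - start time of the first bin, in seconds
    timebin   - width of each bin, in seconds
    threshold - bins with a rate above this value are treated as flared
    """
    gtis = []
    current_start = None
    for i, rate in enumerate(rates):
        bin_start = t_start + i * timebin
        if rate <= threshold:
            # Open a new interval if we are not already inside one
            if current_start is None:
                current_start = bin_start
        elif current_start is not None:
            # A flared bin closes the interval that was open
            gtis.append((current_start, bin_start))
            current_start = None
    if current_start is not None:
        # Close an interval that runs to the end of the light curve
        gtis.append((current_start, t_start + len(rates) * timebin))
    return gtis


# Made-up light curve: bins three and four are 'flared'
rates = [1.0, 1.2, 5.0, 6.0, 0.9, 1.1, 1.0]
gtis = good_time_intervals(rates, t_start=0.0, timebin=20.0, threshold=2.0)
```

Here the two flared bins split the observation into two good-time intervals, 0-40 s and 80-140 s; a smaller timebin would localise flares more precisely at the cost of noisier rates in each bin.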

[4]:
help(flaregti)
Help on function flaregti in module daxa.process.erosita.clean:

flaregti(obs_archive: daxa.archive.base.Archive, pimin: astropy.units.quantity.Quantity = <Quantity 200. eV>, pimax: astropy.units.quantity.Quantity = <Quantity 10000. eV>, mask_pimin: astropy.units.quantity.Quantity = <Quantity 200. eV>, mask_pimax: astropy.units.quantity.Quantity = <Quantity 10000. eV>, binsize: int = 1200, detml: Union[float, int] = 10, timebin: astropy.units.quantity.Quantity = <Quantity 20. s>, source_size: astropy.units.quantity.Quantity = <Quantity 25. arcsec>, source_like: Union[float, int] = 10, threshold: astropy.units.quantity.Quantity = <Quantity -1. ct / (deg2 s)>, max_threshold: astropy.units.quantity.Quantity = <Quantity -1. ct / (deg2 s)>, mask_iter: int = 3, num_cores: int = 115, disable_progress: bool = False, timeout: astropy.units.quantity.Quantity = None)
    The DAXA wrapper for the eROSITA eSASS task flaregti, which attempts to identify good time intervals with
    minimal flaring. This has been tested up to flaregti v1.20.

    This function does not generate final event lists, but instead is used to create good-time-interval files
    which are then applied to the creation of final event lists, along with other user-specified filters, in the
    'cleaned_evt_lists' function.

    :param Archive obs_archive: An Archive instance containing eROSITA mission instances with observations for
        which flaregti should be run. This function will fail if no eROSITA missions are present in the archive.
    :param float pimin:  Lower PI bound of energy range for lightcurve creation.
    :param float pimax:  Upper PI bound of energy range for lightcurve creation.
    :param float mask_pimin: Lower PI bound of energy range for finding sources to mask.
    :param float mask_pimax: Upper PI bound of energy range for finding sources to mask.
    :param int binsize: Bin size of mask image (unit: sky pixels).
    :param int detml: Likelihood threshold for mask creation.
    :param int timebin: Bin size for lightcurve (unit: seconds).
    :param int source_size: Diameter of source extracton area for dynamic threshold calculation (unit: arcsec);
        this is the most important parameter if optimizing for extended sources.
    :param int source_like: Source likelihood for automatic threshold calculation.
    :param float threshold: Flare threshold; dynamic if negative (unit: counts/deg^2/sec).
    :param float max_threshold: Maximum threshold rate, if positive (unit: counts/deg^2/sec),
        if set this forces the threshold to be this rate or less.
    :param int mask_iter: Number of repetitions of source masking and GTI creation.
    :param int num_cores: The number of cores to use, default is set to 90% of available.
    :param bool disable_progress: Setting this to true will turn off the SAS generation progress bar.
    :param Quantity timeout: The amount of time each individual process is allowed to run for, the default is None.
        Please note that this is not a timeout for the entire flaregti process, but a timeout for individual
        ObsID-Inst-subexposure processes.
    :return: Information required by the eSASS decorator that will run commands. Top level keys of any dictionaries are
        internal DAXA mission names, next level keys are ObsIDs. The return is a tuple containing a) a dictionary of
        bash commands, b) a dictionary of final output paths to check, c) a dictionary of extra info (in this case
        obs and analysis dates), d) a generation message for the progress bar, e) the number of cores allowed, and
        f) whether the progress bar should be hidden or not.
    :rtype: Tuple[dict, dict, dict, str, int, bool, Quantity]

Applying event filtering and good-time-intervals to the raw event lists (cleaned_evt_lists)

This function (cleaned_evt_lists) creates the final, filtered and cleaned, event lists for eROSITA data. We make use of the evtool eSASS task for this. Our function will apply the good-time intervals generated by flaregti, and also allows events to be filtered based on pattern, flag, and energy. This is achieved by passing the following arguments:

  • lo_en - This controls the lowest energy of event allowed into the cleaned event lists - the default is 0.2 keV, the lowest allowed by the eSASS tool.

  • hi_en - This controls the highest energy of event allowed into the cleaned event lists - the default is 10.0 keV, the highest allowed by the eSASS tool.

  • flag - Events are flagged during their initial processing (prior to download) - the flags represent combinations of circumstances, and include information on the owner (MPE or IKI), rejection decision, quality, and whether they are corrupted or not. We use a default value that will select all events flagged as either singly corrupt or as part of a corrupt frame.

  • flag_invert - This controls whether the flag is used to define which events to select or which to exclude. It is often easier to define the bad events with a flag and then invert it, which is the default behaviour here - any event selected by flag will be excluded, unless flag_invert is set to False.

  • pattern - Defines which event patterns are acceptable, where a pattern describes how an event was registered by the detector (this discusses eROSITA pattern fractions). The default value is 15, which represents 1111 in binary, which in turn means that single, double, triple, and quadruple events are all selected by default. If the absolute highest quality is required, and you have sufficient events, then it may make sense to limit this more, in which case you could pass 1000 (for singles only), or 1010 (for singles and triples), etc.
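
Since flag and pattern are both bit masks stored as integers, it can help to look at their binary form. The sketch below inspects the two default values and then applies a deliberately simplified selection rule; treating an event as 'selected' when it shares any set bit with flag, and using inclusive energy bounds, are assumptions made for illustration rather than a description of what evtool does internally. The `set_bits` and `keep_event` helpers are hypothetical:

```python
DEFAULT_FLAG = 3221225472  # default `flag` value of cleaned_evt_lists
DEFAULT_PATTERN = 15       # default `pattern` value of cleaned_evt_lists


def set_bits(value):
    """Return the positions (0 = least significant) of the bits set in value."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


def keep_event(event_flag, energy_kev, flag=DEFAULT_FLAG, flag_invert=True,
               lo_en=0.2, hi_en=10.0):
    """Simplified stand-in for the flag and energy selection (assumed logic)."""
    # Assumption: an event is 'selected' by the flag if any masked bit is set
    selected = (event_flag & flag) != 0
    # With flag_invert=True the selected events are the ones thrown away
    passes_flag = (not selected) if flag_invert else selected
    return passes_flag and lo_en <= energy_kev <= hi_en


flag_hex = hex(DEFAULT_FLAG)             # only the two highest bits of a 32-bit word
flag_bits = set_bits(DEFAULT_FLAG)
pattern_bits = set_bits(DEFAULT_PATTERN)  # four set bits, i.e. binary 1111
as_integer = int("1010", 2)               # a binary string read as an integer
```

The default flag works out to `0xc0000000`, so only bits 30 and 31 are set. The `int("1010", 2)` line shows how a binary string maps to an integer; whether DAXA expects the pattern as that integer or as the binary digits themselves should be checked against the function's help text.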

[5]:
help(cleaned_evt_lists)
Help on function cleaned_evt_lists in module daxa.process.erosita.assemble:

cleaned_evt_lists(obs_archive: daxa.archive.base.Archive, lo_en: astropy.units.quantity.Quantity = <Quantity 0.2 keV>, hi_en: astropy.units.quantity.Quantity = <Quantity 10. keV>, flag: int = 3221225472, flag_invert: bool = True, pattern: int = 15, num_cores: int = 115, disable_progress: bool = False, timeout: astropy.units.quantity.Quantity = None)
    The function wraps the eROSITA eSASS task evtool, which is used for selecting events.
    This has been tested up to evtool v2.10.1

    This function is used to apply the soft-proton filtering (along with any other filtering you may desire, including
    the setting of energy limits) to eROSITA event lists, resulting in the creation of sets of cleaned event lists
    which are ready to be analysed.

    :param Archive obs_archive: An Archive instance containing eROSITA mission instances with observations for
        which cleaned event lists should be created. This function will fail if no eROSITA missions are present in
        the archive.
    :param Quantity lo_en: The lower bound of an energy filter to be applied to the cleaned, filtered, event lists. If
        'lo_en' is set to an Astropy Quantity, then 'hi_en' must be as well. Default is 0.2 keV, which is the
        minimum allowed by the eROSITA toolset. Passing None will result in the default value being used.
    :param Quantity hi_en: The upper bound of an energy filter to be applied to the cleaned, filtered, event lists. If
        'hi_en' is set to an Astropy Quantity, then 'lo_en' must be as well. Default is 10 keV, which is the
        maximum allowed by the eROSITA toolset. Passing None will result in the default value being used.
    :param int flag: FLAG parameter to select events based on owner, information, rejection, quality, and corrupted
        data. The eROSITA website contains the full description of event flags in section 1.1.2 of the following link:
        https://erosita.mpe.mpg.de/edr/DataAnalysis/prod_descript/EventFiles_edr.html. The default parameter will
        select all events flagged as either singly corrupt or as part of a corrupt frame.
    :param bool flag_invert: If set to True, this function will discard all events selected by the flag parameter.
        This is the default behaviour.
    :param int pattern: Selects events of a certain pattern chosen by the integer key. The default of 15 selects
        all four of the recognized legal patterns.
    :param int num_cores: The number of cores to use, default is set to 90% of available.
    :param bool disable_progress: Setting this to true will turn off the eSASS generation progress bar.
    :param Quantity timeout: The amount of time each individual process is allowed to run for, the default is None.
        Please note that this is not a timeout for the entire cleaned_evt_lists process, but a timeout for individual
        ObsID-Inst-subexposure processes.