The following is information about the AstroData class. For descriptions of arguments shown for the class constructor, see AstroData.__init__(..). This documentation is generated in part from in-source docstrings.
To import the AstroData class use:
from astrodata import AstroData
The AstroData class abstracts datasets stored in MEF files and provides uniform interfaces for working on datasets from different instruments and modes. Configuration packages are used to describe the specific data characteristics, layout, and to store type-specific implementations.
MEFs can be generalized as lists of header-data units (HDU), with key-value pairs populating headers, and pixel values populating the data array. AstroData interprets a MEF as a single complex entity. The individual “extensions” within the MEF are available using Python list (“[]”) syntax; they are wrapped in AstroData objects (see AstroData.__getitem__()). AstroData uses pyfits for MEF I/O and numpy for pixel manipulations.
While the pyfits and numpy objects are available to the programmer, AstroData provides analogous methods for most pyfits functionality, which allows it to maintain the dataset as a cohesive whole. The programmer does, however, use the numpy.ndarrays directly for pixel manipulation. Simple AstroData arithmetic is also provided by the astrodata.adutils.arith module, which implements AstroData methods for addition, subtraction, multiplication and division.
In order to identify types of dataset and provide type-specific behavior, AstroData relies on configuration packages found either through the PYTHONPATH environment variable or through the Astrodata package environment variables, ADCONFIGPATH and RECIPEPATH. A configuration package (e.g. astrodata_Gemini) contains definitions for all instruments and modes: type definitions, meta-data functions, information lookup tables, and any other code or information needed to handle specific types of dataset.
This allows AstroData to manage access to the dataset for convenience and consistency. For example, AstroData is able:
In general, the purpose of AstroData is to provide smart dataset-oriented interfaces that adapt to dataset type. The primary interfaces are for file handling, dataset-type checking, and managing meta-data, but AstroData also integrates other functionalities.
The AstroData constructor constructs an in-memory representation of a dataset. If given a filename it uses pyfits to open the dataset, reads the header and detects applicable types. Binary data, such as pixel data, is left on disk until referenced.
This function appends header-data units (HDUs) to the AstroData instance.
The close(..) function will close the HDUList associated with this AstroData instance.
This function inserts header-data units (HDUs) to the AstroData instance.
The info(..) function prints to the shell information regarding the PHU and the extensions found in an AstroData object. It is a high-level wrapper for infostr(..).
The infostr(..) function is used to get a string ready for display either as plain text or HTML. It provides AstroData-relative information.
Parameters: | prefix (string); suffix (string) – add a suffix to the filename. |
---|---|
The write function acts similarly to the pyfits HDUList.writeto(..) function if a filename is given, or like pyfits.HDUList.update(..) if no name is given, using whatever the current name is set to. When a name is given, this becomes the new name of the AstroData object and will be used on subsequent calls to write for which a filename is not provided. If the clobber flag is False (the default) then write(..) throws an exception if the file already exists.
Parameters: | typenames (string or list of strings) – specifies the type name to check. |
---|---|
Returns: | True if the given types all apply to this dataset, False otherwise. |
Return type: | Bool |
This function checks the AstroData object to see if it is the given type(s) and returns True if so. If a list of types is given as inputs, all the types must match the AstroData object.
Note : | AstroData.check_type(..) is an alias for AstroData.is_type(..). |
---|
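The "all the types must match" rule can be illustrated without AstroData itself. The following is a minimal sketch, not the library's implementation: the standalone `is_type` helper, the `applicable_types` set, and the type names other than GEMINI, RAW and PREPARED are hypothetical stand-ins for the types detected on a dataset.

```python
def is_type(applicable_types, typenames):
    """Return True only if every requested type applies (hypothetical helper)."""
    # Accept a single name or a list of names, as the Parameters entry describes.
    if isinstance(typenames, str):
        typenames = [typenames]
    return all(name in applicable_types for name in typenames)


# Hypothetical set of types detected for one dataset.
applicable_types = {"GEMINI", "GMOS", "GMOS_IMAGE", "RAW"}

single_match = is_type(applicable_types, "GMOS")
all_match = is_type(applicable_types, ["GMOS", "RAW"])
partial_match = is_type(applicable_types, ["GMOS", "PREPARED"])
```

In this sketch a list that mixes an applicable type with a non-applicable one evaluates to False, which mirrors the behavior described above.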
Parameters: | prune (bool) – flag which controls ‘pruning’ the returned type list so that only the leaf node type for a given set of related types is returned. |
---|---|
Returns: | a list of classification names that apply to this data |
Return type: | list of strings |
The get_types(..) function returns a list of type names, where type names are, as always, strings. It is possible to ‘prune’ the list so that only leaf nodes are returned, which is useful when leaf nodes take precedence, such as for descriptors.
KL: Please add definition of “leaf node”.
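As an illustration of pruning, the sketch below assumes that a "leaf node" is an applicable type that is not an ancestor of any other applicable type. That definition, the `PARENT` table, the helper names, and the type names other than GEMINI and RAW are assumptions made for this example; they are not taken from the AstroData source.

```python
# Hypothetical parent links for a small typology tree (child -> parent).
PARENT = {
    "GMOS": "GEMINI",
    "GMOS_IMAGE": "GMOS",
    "NIRI": "GEMINI",
}


def ancestors(typename):
    """Yield every ancestor of a type by following the parent links."""
    while typename in PARENT:
        typename = PARENT[typename]
        yield typename


def prune_to_leaves(typenames):
    """Keep only the types that are not an ancestor of another listed type."""
    covered = {anc for name in typenames for anc in ancestors(name)}
    return [name for name in typenames if name not in covered]


all_types = ["GEMINI", "GMOS", "GMOS_IMAGE"]
leaf_types = prune_to_leaves(all_types)
```

Under these assumptions only the most specific type of each related chain survives, while unrelated types (for example a status type such as RAW) are left in place.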
Note: types are divided into two categories, one intended for types which represent processing status (i.e. RAW vs PREPARED), and another which contains a more traditional ‘typology’ consisting of a hierarchical tree of dataset types. This latter tree maps roughly to instrument-modes, with instrument types branching from the general observatory type, (e.g. ‘GEMINI’).
To retrieve only status types, use get_status(..); to retrieve just typological types use get_typology(..). Note that the system does not enforce what checks are actually performed by types in each category, that is, one could miscategorize a type when authoring a configuration package. Both classifications use the same DataClassification objects to classify datasets. It is up to those implementing the type-specific configuration package to ensure types related to status appear in the correct part of the configuration space.
Currently the distinction between status and typology is not used by the system (e.g. in type-specific default recipe assignments) and is provided as a service for higher level code, e.g. primitives and scripts which make use of the distinction.
This function returns the set of type names (strings) which apply to this dataset and which come from the status section of the AstroData Type library. ‘Status’ classifications are those which tend to change during the reduction of a dataset based on the amount of processing, e.g. RAW vs PREPARED. Strictly, a ‘status’ type is any type defined in or below the status part of the classification directory within the configuration package. For example, in the Gemini type configuration this means any type definition files in or below the ‘astrodata_Gemini/ADCONFIG/classification/status’ directory.
Returns: | a list of string classification names |
---|---|
Return type: | list of strings |
This function returns the set of type names (strings) which apply to this dataset and which come from the typology section of the AstroData Type library. ‘Typology’ classifications are those which tend to remain with the data in spite of reduction status, e.g. those related to the instrument-mode of the dataset or of the datasets used to produce it. Strictly these consist of any type defined in or below the correct configuration directory, for example, in Gemini’s configuration package, it would be anything in the “astrodata_Gemini/ADCONFIG/classification/types” directory.
Returns: | a list of classification name strings |
---|---|
Return type: | list of strings |
Manipulations of headers, specifically retrieving and setting key-value pair settings in the header section of header-data units can be done directly using the AstroData header manipulation functions which cover both PHU and extension headers. For higher level metadata which is available for all types in the tree in a properly constructed configuration space, the metadata is retrieved with descriptor functions, accessed as members of the AstroData object.
To retrieve or set meta-data not covered by descriptors, one must read and write key-value pairs to the HDU headers at a lower level. AstroData offers three pairs of functions for getting and setting header values, one pair for each of three distinct cases. While it is possible to use the pyfits.Header directly (available via “ad[..].header”), it is preferable to use the AstroData calls, which allow AstroData to keep type information up to date, as well as to update any other characteristics of the AstroData object which may need to be maintained when the dataset is changed.
The three distinct pairs of header access functions serve the following purposes:
Parameters: | key (string) – name of header value to retrieve |
---|---|
Return type: | string |
Returns: | the key’s value as string or None if not present. |
The phu_get_key_value(..) function returns the value associated with the given key within the primary header unit of the dataset. The value is returned as a string (storage format) and must be converted as necessary by the caller.
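Because the value comes back as a string, or as None when the key is absent, the caller has to handle both cases. A minimal sketch of such a conversion follows; the `to_float` helper and the sample values are hypothetical and only stand in for whatever phu_get_key_value(..) would return.

```python
def to_float(raw_value, default=None):
    """Convert a header value returned as a string; None means the key is absent."""
    if raw_value is None:
        return default
    return float(raw_value)


# Stand-ins for values returned by a header lookup.
exptime = to_float("30.5")
missing = to_float(None, default=0.0)
```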
Add or update a keyword in the PHU of the AstroData object with a specific value and, optionally, a comment.
Parameters: | key (string) – name of header value to set |
---|---|
Returns: | the specified value |
Return type: | string |
The get_key_value(..) function is used to get the value associated with a given key in the data-header unit of a single-HDU AstroData instance (such as returned by iteration).
Note : | Single extension AstroData objects are those with only a single header-data unit besides the PHU. They may exist if a single extension file is loaded, but in general are produced by indexing or iteration instructions, i.e.:
The variable “sead” above is guaranteed to hold a single extension AstroData object, and can be used more conveniently. |
---|
The set_key_value(..) function is used to set the value (and optionally the comment) associated with a given key in the data-header of a single-HDU AstroData instance. The value argument will be converted to a string, so it must either support string conversion or be passed in as a string.
Note : | Single extension AstroData objects are those with only a single header-data unit besides the PHU. They may exist if a single extension file is loaded, but in general are produced by indexing or iteration instructions, i.e.:
The variable “sead” above is guaranteed to hold a single extension AstroData object, and can be used more conveniently. |
---|
Parameters: |
|
---|---|
Return type: | string |
Returns: | the value associated with the key, or None if not present |
This function returns the value from the given extension’s header, with “0” being the first data extension. To get values from the PHU use phu_get_key_value(..).
Add or update a keyword in the header of an extension of the AstroData object with a specific value and, optionally, a comment. To add or update a keyword in the PHU of the AstroData object, use phu_set_key_value().
Parameters: | extname (string) – the name of the extension, equivalent to the value associated with the “EXTNAME” key in the extension header. |
---|---|
Returns: | number of extensions of that name |
Return type: | int |
The count_exts(..) function counts the extensions of a given name (as stored in the HDU’s “EXTNAME” header).
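The counting itself amounts to tallying extensions by their EXTNAME. The sketch below shows the idea on a plain list of (EXTNAME, EXTVER) pairs; the list and the standalone `count_exts` helper are hypothetical and do not touch a real MEF.

```python
# Hypothetical (EXTNAME, EXTVER) identifiers of the extensions in one MEF,
# with the PHU excluded.
extensions = [
    ("SCI", 1), ("VAR", 1), ("DQ", 1),
    ("SCI", 2), ("VAR", 2), ("DQ", 2),
]


def count_exts(extensions, extname):
    """Count the extensions whose EXTNAME equals the given name."""
    return sum(1 for name, _extver in extensions if name == extname)


sci_count = count_exts(extensions, "SCI")
mdf_count = count_exts(extensions, "MDF")
```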
Parameters: | ext (string, int, or tuple) – The integer index, an indexing (EXTNAME, EXTVER) tuple, or EXTNAME name. If an int or tuple, the single extension identified is wrapped with an AstroData instance, and “single-extension” members of the AstroData object can be used. If a string, EXTNAME, is given, then all extensions with the given EXTNAME will be wrapped by the new AstroData instance. |
---|---|
Returns: | an AstroData instance associated with the subset of data. |
Return type: | AstroData |
This function supports the “[]” syntax for AstroData instances, e.g. ad[(“SCI”,1)]. We use it to create AstroData objects associated with “subdata” of the parent AstroData object, that is, consisting of an HDUList made up of some subset of the parent MEF. e.g.:
from astrodata import AstroData
datasetA = AstroData("datasetMEF.fits")
datasetB = datasetA["SCI"]
In this case, after the operations, datasetB is an AstroData object associated with the same MEF, sharing some of the same actual HDUs in memory as datasetA. The object in datasetB will behave as if the SCI extensions are its only members, and it does in fact have its own pyfits.HDUList. Note that datasetA and datasetB share the PHU and also the data structures of the HDUs they have in common, so that a change to datasetA[('SCI',1)].data will change the datasetB[('SCI',1)].data member and vice versa. They are in fact both references to the same numpy array in memory. The HDUList is a different list, however, that references common HDUs. If a subdata related AstroData object is written to disk, the resulting MEF will contain only the extensions in the subdata’s HDUList.
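The sharing described here, distinct lists that reference common HDU objects, can be reproduced with plain Python objects. This is only a sketch of the reference semantics: `FakeHDU` is a hypothetical stand-in for a pyfits HDU, Python lists stand in for numpy arrays, and the PHU is omitted.

```python
class FakeHDU:
    """Hypothetical stand-in for an HDU: a name, a version and pixel data."""

    def __init__(self, extname, extver, data):
        self.extname = extname
        self.extver = extver
        self.data = data


# Parent "HDU list" (PHU omitted for brevity).
parent_hdus = [
    FakeHDU("SCI", 1, [1, 2, 3]),
    FakeHDU("VAR", 1, [0, 0, 0]),
    FakeHDU("SCI", 2, [4, 5, 6]),
]

# Sub-data: a new list object that references the same HDU objects.
sci_hdus = [hdu for hdu in parent_hdus if hdu.extname == "SCI"]

# Modifying pixel data through the parent is visible through the sub-data.
parent_hdus[0].data[0] = 99
shared_value = sci_hdus[0].data[0]

# The lists themselves are independent: appending to one leaves the other alone.
sci_hdus.append(FakeHDU("SCI", 3, [7, 8, 9]))
```

After these operations the parent list still has three members, while the sub-data list has its own three, two of them shared with the parent.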
Note : | Integer extensions start at 0 for the data-containing extensions, not at the PHU as with pyfits. This is important: ad[0] is the first content extension, which in a traditional MEF perspective is the extension AFTER the PHU; it is not the PHU! In AstroData instances, the PHU is purely a header, and is not counted as an extension, in the way that headers generally are not counted as their own elements in the array they contain meta-data for. The PHU can be accessed via the phu AstroData member or by using the PHU related member functions. |
---|
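The offset between the two conventions can be written down explicitly. The sketch below assumes a flat HDU list whose position 0 holds the PHU; the `pyfits_index` helper and the `hdu_names` list are hypothetical and only illustrate the arithmetic.

```python
def pyfits_index(ad_index):
    """Map an AstroData-style integer index to a position in the HDU list.

    Position 0 of the HDU list holds the PHU, so extension n sits at n + 1.
    """
    if ad_index < 0:
        raise ValueError("this sketch only handles non-negative indices")
    return ad_index + 1


# Hypothetical layout of an HDU list: the PHU first, then the extensions.
hdu_names = ["PRIMARY", "SCI", "VAR", "DQ"]
first_extension = hdu_names[pyfits_index(0)]
```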
The data property can only be used for single-HDU AstroData instances, such as those returned during iteration. It is a property attribute which uses get_data(..) and set_data(..) to access the data members with “=” syntax. To set the data member, use ad.data = newdata, where newdata must be a numpy array. To get the data member, use npdata = ad.data.
Returns: | data array associated with the single extension |
---|---|
Return type: | numpy.ndarray |
The get_data(..) member is the function behind the property-style “data” member. It returns the HDU’s data member specifically for the case in which the AstroData instance has ONE HDU (in addition to the PHU). This allows a single-extension AstroData, such as AstroData generates through iteration, to be used as though it were simply the one extension. One is dealing with single-extension AstroData instances when iterating over the AstroData extensions and when picking out an extension by integer or tuple indexing, e.g.:
for ad in dataset["SCI"]:
    # ad is a single-HDU AstroData instance
    ad.data = newdata
# assuming the named extension exists,
# sd will be a single-HDU AstroData
sd = dataset[("SCI", 1)]
Parameters: | newdata (numpy.ndarray) – new data objects |
---|---|
Raises Errors.SingleHDUMemberExcept: | if the AstroData instance has more than one extension (not including the PHU). |
This function sets the data member of a data section of an AstroData object, specifically for the case in which the AstroData instance has ONE header-data unit (in addition to PHU). This case is assured when iterating over the AstroData extensions, as in:
for ad in dataset["SCI"]:
    ...
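The single-HDU restriction on the data property can be sketched with a small stand-in class. Everything here is hypothetical: `FakeAstroData` is not the real class, the exception is a local substitute for Errors.SingleHDUMemberExcept, the PHU is not modelled, and, as a simplification, the sketch also raises when there are no extensions at all.

```python
class SingleHDUMemberExcept(Exception):
    """Local stand-in for Errors.SingleHDUMemberExcept."""


class FakeAstroData:
    """Hypothetical container holding one data payload per extension."""

    def __init__(self, extension_data):
        self._extension_data = list(extension_data)

    def _require_single(self):
        # The PHU is not modelled here, so only extensions are counted.
        if len(self._extension_data) != 1:
            raise SingleHDUMemberExcept("member requires exactly one extension")

    def get_data(self):
        self._require_single()
        return self._extension_data[0]

    def set_data(self, newdata):
        self._require_single()
        self._extension_data[0] = newdata

    # Property-style access: ad.data reads, ad.data = x writes.
    data = property(get_data, set_data)

    def __iter__(self):
        # Iteration yields single-extension wrappers, one per extension.
        for payload in self._extension_data:
            yield FakeAstroData([payload])


dataset = FakeAstroData([[1, 2], [3, 4]])
per_extension = [single.data for single in dataset]

try:
    dataset.data
    raised = False
except SingleHDUMemberExcept:
    raised = True
```

Unlike the real class, the wrappers produced by this sketch do not share storage with their parent; it only demonstrates when the property is usable.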
The header property can only be used for single-HDU AstroData instances, such as those returned during iteration. It is a property attribute which uses get_header(..) and set_header(..) to access the header member with the “=” syntax. To set the header member, use ad.header = newheader, where newheader must be a pyfits.Header object. To get the header member, use hduheader = ad.header.
Returns: | header |
---|---|
Return type: | pyfits.Header |
Raises Errors.SingleHDUMemberExcept: | if more than one extension exists (note: the PHU is not considered an extension in this case). |
The get_header(..) function returns the header member of a single-HDU AstroData instance (one that has only one extension plus the PHU) when extension is None. This case can be assured when iterating over extensions using AstroData, e.g.:
for ad in dataset["SCI"]:
    ...
Otherwise, the extension can be specified. Either way, only one header for one extension is returned.
Parameters: |
|
---|---|
Raises Errors.SingleHDUMemberExcept: | if more than one extension exists. |
The set_header(..) function sets the extension header member of a single-extension AstroData instance (one that has only one extension plus the PHU) when extension is None. This case is assured when iterating over extensions using AstroData, e.g.:
for ad in dataset["SCI"]:
    ...
Otherwise, the extension can be specified. Either way, only one header for one extension is operated upon.
Note: This member only works on single extension AstroData instances.
The rename_ext(..) function is used to rename an HDU with a new EXTNAME and EXTVER identifier. Merely changing the EXTNAME and EXTVER values in the extension’s pyfits.Header is not sufficient: although the values change in the pyfits.Header object, there are special HDU class members which are not updated.
Warning : | This function manipulates private (or somewhat private) HDU members, specifically ‘name’ and ‘_extver’. STSCI has been informed of the issue and has made a special HDU function for performing the renaming. When generally available, this new function will be used instead of manipulating the HDU’s properties directly, and this function will call the new pyfits.HDUList(..) function. |
---|
WARNING!!!! The code is not doing what the docstring claims. - KL Apr 2014
Parameters: | iarray – A list of AstroData instances for which a correlation dictionary will be constructed. |
---|---|
Returns: | a list of tuples containing correlated extensions from the arguments. |
Return type: | list of tuples |
The correlate(..) function is a module-level helper function which returns a list of tuples of Single Extension AstroData instances which associate extensions from each listed AstroData object, to identically named extensions among the rest of the input array. The correlate(..) function accepts a variable number of arguments, all of which should be AstroData instances.
The function returns a structured dictionary of dictionaries of lists of AstroData objects. For example, take three inputs, ad, bd and cd, all with three “SCI”, “VAR” and “DQ” extensions. Given adlist = [ad, bd, cd], the call corstruct = correlate(adlist) returns a dictionary keyed first by EXTNAME and then by EXTVER. The contents (e.g. of corstruct[“SCI”][1]) are a list of AstroData instances, each containing a header-data unit from ad, bd and cd respectively.
Info : | to appear in the list, all the given arguments must have an extension with the given (EXTNAME,EXTVER) for that tuple. |
---|
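The nested layout described above can be sketched with plain dictionaries. This is an illustration of the described structure only, not the behavior of the library's correlate(..): the `correlate_ids` helper is hypothetical, each input is modelled as a mapping from (EXTNAME, EXTVER) to a placeholder payload, and the assumption is that an entry appears only when every input has that (EXTNAME, EXTVER).

```python
def correlate_ids(inputs):
    """Group payloads by EXTNAME, then EXTVER, across all inputs.

    Each input is a dict mapping (EXTNAME, EXTVER) to an extension payload.
    Only identifiers present in every input are kept.
    """
    if not inputs:
        return {}
    common = set(inputs[0])
    for extmap in inputs[1:]:
        common &= set(extmap)
    result = {}
    for extname, extver in sorted(common):
        by_version = result.setdefault(extname, {})
        by_version[extver] = [extmap[(extname, extver)] for extmap in inputs]
    return result


# Placeholder payloads stand in for single-extension AstroData instances.
ad = {("SCI", 1): "ad-SCI1", ("VAR", 1): "ad-VAR1", ("DQ", 1): "ad-DQ1"}
bd = {("SCI", 1): "bd-SCI1", ("VAR", 1): "bd-VAR1"}
cd = {("SCI", 1): "cd-SCI1", ("VAR", 1): "cd-VAR1", ("DQ", 1): "cd-DQ1"}

corstruct = correlate_ids([ad, bd, cd])
```

Here the second input lacks a ("DQ", 1) extension, so "DQ" does not appear in the result at all.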
Parameters: |
|
---|---|
Returns: | an AstroData instance initialized with appropriate header-data units such as the PHU, Standard Gemini headers and with type-specific associated data-header units such as binary table Mask Definition tables (MDF). |
Return type: | AstroData |
Info : | File will not have been written to disk by prep_output(..). |
The prep_output(..) function creates a new AstroData object ready for appending output information (e.g. ad.append(..)). While you can also create an empty AstroData object by giving no arguments to the AstroData constructor (i.e. ad = AstroData()), prep_output(..) exists for the common case where a new dataset object is intended as the output of some processing on a list of source datasets, and some information from the source inputs must be propagated.
The prep_output(..) function makes use of this knowledge to ensure the file meets standards in what is considered a complete output file given such a combination. In the future this function can make use of dataset history and structure definitions in the ADCONFIG configuration space. As prep_output improves, scripts and primitives that use it will benefit in a forward compatible way, in that their output datasets will benefit from more automatic propagation, validations, and data flow control, such as the emergence of history database propagation.
Presently, it already provides the following:
Parameters: |
|
---|---|
Returns: | a list of matching keys |
Return type: | list of strings |
This utility function returns a list of keys from the input header that match the regular expression.
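A minimal sketch of this kind of key filtering, using the standard `re` module on a plain dictionary, is shown below. The `get_matching_keys` helper and the sample header are hypothetical, and the choice of `re.match` (which anchors at the start of the key) is an assumption; the library may match differently.

```python
import re


def get_matching_keys(header, pattern):
    """Return the keys of a mapping that match a regular expression."""
    regex = re.compile(pattern)
    # re.match anchors at the start of the key (an assumption of this sketch).
    return [key for key in header if regex.match(key)]


# Hypothetical header contents.
header = {"NAXIS": 2, "NAXIS1": 2048, "NAXIS2": 4608, "EXPTIME": 30.0}
naxis_keys = get_matching_keys(header, r"NAXIS\d")
```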