Data distribution

Since 2005B, all data obtained at both Gemini telescopes in normal queue or classical modes have been distributed in electronic form from the Gemini Science Archive (GSA), hosted at the Canadian Astronomy Data Centre (CADC).

The data flow process is as follows. After data quality assessment, the science and calibration frames are uploaded to the GSA. Once a week, Gemini staff review the data ingestion and generate the "PI packages" for retrieval. Currently the PI package contains raw science data (including calibrations taken as part of the program) and ancillary files (e.g. observing logs and weather images). An e-mail notification is sent to the PI (or principal contact) when data are ready for retrieval. This e-mail is copied to both the NGO and the Gemini Contact Scientist. In addition, at any time the PI can retrieve their own data (but not the ancillary files), as well as any non-proprietary data already in the GSA.

The average time to ingest files from gsag[ns]/dataflow is approximately 30-38 minutes. Ingestion from Gemini North is slower on average than from Gemini South, perhaps because of the higher frame rate from NIRI. The median is significantly lower, at 12-15 minutes, and is probably a more accurate measure of the overall ingestion time for a single "random" file. These figures are times to ingest a single frame, so they apply to the raw data sets. The transfer slows considerably when a large number of frames is queued for ingestion at the same time. The electronic transfer was not designed as a substitute for "eavesdropping", so if a PI absolutely needs to look at the data before the next step in the observation is taken (for example), other arrangements have to be agreed upon, as we cannot guarantee that the normal transfer will be fast enough.
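
The gap between the average and the median quoted above is what one would expect if most files are ingested quickly while a few wait in a long queue. The following minimal Python sketch illustrates this; the sample times are invented for illustration and are not measured Gemini data, and the function name summarize is hypothetical.

```python
from statistics import mean, median

# Hypothetical per-file ingestion times in minutes (illustrative only, not
# measured Gemini data). Most files arrive quickly; two waited in a long queue.
ingest_minutes = [10, 11, 12, 12, 13, 14, 15, 16, 90, 120]


def summarize(times):
    """Return (mean, median) of a list of ingestion times in minutes."""
    return mean(times), median(times)


avg, med = summarize(ingest_minutes)
print(f"mean = {avg:.1f} min, median = {med:.1f} min")
```

In this made-up sample the two slow files pull the mean well above the median, which is why the median is the better guide to how long a single typical file takes.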

Implementation of "observing classes" in the OT now permits most calibrations to be directly associated with the program, and therefore these are also available within the data package. Raw (and in some cases processed) calibration data can be retrieved using the "Search Complete Catalogue" query page at the GSA web site. More details on which frames are calibration frames, as well as instructions on how to generate the GSA queries for each instrument mode, are available in the calibration retrieval information.

Please note: if you do not have a CADC user account, you should create one BEFORE you attempt to retrieve your data. Without an account you will not be able to download data from CADC. Go to the GSA homepage and click on Register at top-right. A simple form needs to be completed, and within 24 hours you will receive an e-mail confirming that your account is active. In addition to your CADC account information, to access your PI data you will also need the Gemini Program ID (e.g. GN-2004B-Q-28) and the corresponding Phase II program key (e-mailed to you when you were notified of your successful observing time allocation).
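
Before entering a Program ID it can help to check that it is well formed. The Python sketch below is only an illustration: the pattern is an assumption inferred from the single example GN-2004B-Q-28 (site, semester, program type, number), not an official specification, and the names PROGRAM_ID_RE and parse_program_id are hypothetical.

```python
import re

# Assumed pattern, inferred only from the example ID "GN-2004B-Q-28":
# site (GN or GS), semester (year plus A or B), program type letters, number.
# Real Program IDs may follow rules that this sketch does not capture.
PROGRAM_ID_RE = re.compile(r"^(G[NS])-(\d{4}[AB])-([A-Z]+)-(\d+)$")


def parse_program_id(program_id):
    """Split a Program ID into its parts, or return None if it does not match."""
    m = PROGRAM_ID_RE.match(program_id.strip())
    if m is None:
        return None
    site, semester, kind, number = m.groups()
    return {"site": site, "semester": semester, "type": kind, "number": int(number)}


print(parse_program_id("GN-2004B-Q-28"))
```

A None result only means the string does not fit the assumed pattern; it says nothing about whether the program exists in the GSA.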

If you are not able to download your "PI package" from the GSA due to bandwidth restrictions, please use the HelpDesk to request that the data be sent on hard media. This is restricted to DVDs or DAT 3 tapes (no CDs). DVD burners/readers are not quite standard yet, so please check via the HelpDesk before requesting that data be sent to you this way. Since the data will be shipped from either Hawaii or Chile, we strongly encourage PIs to use the e-distribution to avoid delays and import fees.

Current operational model 

  • Data acquired at Gemini North and Gemini South using facility instruments are sent directly to the GSA via electronic transfer in 'raw' form.
  • Ancillary meta-data is ingested into the MDDB at CADC.
  • Gemini staff perform data quality control and assessment on 'raw' data and re-ingest checked data to GSA. This process can take a few days.
  • All data and meta-data are available to PIs (with Phase II keycode) as soon as they are ingested into the GSA.
  • All data and meta-data are available to the community after the proprietary period of 18 months has expired (3 months for SV data).
  • Meta-data (such as dataset headers) are available to the community immediately upon ingest unless a meta-data restriction is requested by the PI. This mode is active for data acquired starting 2008B and must be requested during Phase I or at the time of Phase II submission (before a program is started). Please contact the Gemini Associate Director for Science Operations for more information.
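
The proprietary periods listed above (18 months, or 3 months for SV data) can be turned into a public release date with simple month arithmetic. The Python sketch below is a minimal illustration under stated assumptions: the text does not say from which date the period is counted, so the start date is left to the caller, and the end-of-month clamping rule and the function names add_months and public_release_date are hypothetical choices, not Gemini or CADC policy.

```python
import calendar
from datetime import date

PROPRIETARY_MONTHS = 18      # standard proprietary period from the list above
SV_PROPRIETARY_MONTHS = 3    # shorter period for SV data


def add_months(start, months):
    """Return start shifted by whole months, clamping to the month's last day."""
    total = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def public_release_date(start, is_sv=False):
    """Date on which data become public, counted from an assumed start date."""
    months = SV_PROPRIETARY_MONTHS if is_sv else PROPRIETARY_MONTHS
    return add_months(start, months)


print(public_release_date(date(2008, 8, 31)))
print(public_release_date(date(2008, 8, 31), is_sv=True))
```

Clamping to the last day of the month avoids invalid dates such as 31 February; whether the archive applies the same rule is not stated in the text.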