Gen3 Submission

class gen3.submission.Gen3Submission(endpoint, auth_provider)[source]

Bases: object

Submit/Export/Query data from a Gen3 Submission system.

A class for interacting with the Gen3 submission services. Supports submitting and exporting from Sheepdog. Supports GraphQL queries through Peregrine.

Parameters:
  • endpoint (str) – The URL of the data commons.
  • auth_provider (Gen3Auth) – A Gen3Auth class instance.

Examples

This creates a Gen3Submission instance pointed at the sandbox commons, using the credentials.json downloaded from the commons profile page.

>>> from gen3.auth import Gen3Auth
>>> from gen3.submission import Gen3Submission
>>> endpoint = "https://nci-crdc-demo.datacommons.io"
>>> auth = Gen3Auth(endpoint, refresh_file="credentials.json")
>>> sub = Gen3Submission(endpoint, auth)

create_program(json)[source]

Create a program.

Parameters:json (object) – The json of the program to create.

Examples

This creates a program in the sandbox commons.

>>> Gen3Submission.create_program(json)
create_project(program, json)[source]

Create a project.

Parameters:
  • program (str) – The program to create a project on.
  • json (object) – The json of the project to create.

Examples

This creates a project on the DCF program in the sandbox commons.

>>> Gen3Submission.create_project("DCF", json)
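
The json arguments to create_program and create_project are plain Python dictionaries. The sketch below shows what such payloads might look like. The field names (type, name, code, dbgap_accession_number) and the placeholder values are assumptions based on a typical Gen3 data dictionary, not something this page specifies, so check the dictionary of the target commons before relying on them.

```python
import json

# Hypothetical payloads: the required fields depend on the commons'
# data dictionary. The names and values below are assumptions.
program_json = {
    "type": "program",
    "name": "DCF",
    "dbgap_accession_number": "phs000000",  # placeholder value
}

project_json = {
    "type": "project",
    "code": "CCLE",
    "name": "CCLE",
    "dbgap_accession_number": "phs000001",  # placeholder value
}

# Both objects must be JSON-serializable before they can be sent.
print(json.dumps(program_json, indent=2))
print(json.dumps(project_json, indent=2))
```

With a Gen3Submission instance named sub, these would be passed as sub.create_program(program_json) and sub.create_project("DCF", project_json).
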
delete_program(program)[source]

Delete a program.

This deletes an empty program from the commons.

Parameters:program (str) – The program to delete.

Examples

This deletes the “DCF” program.

>>> Gen3Submission.delete_program("DCF")
delete_project(program, project)[source]

Delete a project.

This deletes an empty project from the commons.

Parameters:
  • program (str) – The program containing the project to delete.
  • project (str) – The project to delete.

Examples

This deletes the “CCLE” project from the “DCF” program.

>>> Gen3Submission.delete_project("DCF", "CCLE")
delete_record(program, project, uuid)[source]

Delete a record from a project.

Parameters:
  • program (str) – The program to delete from.
  • project (str) – The project to delete from.
  • uuid (str) – The uuid of the record to delete.

Examples

This deletes a record from the CCLE project in the sandbox commons.

>>> Gen3Submission.delete_record("DCF", "CCLE", uuid)
export_node(program, project, node_type, fileformat, filename=None)[source]

Export all records in a single node type of a project.

Parameters:
  • program (str) – The program to which records belong.
  • project (str) – The project to which records belong.
  • node_type (str) – The name of the node to export.
  • fileformat (str) – Export data as either ‘json’ or ‘tsv’
  • filename (str) – Name of the file to export to; if no filename is provided, prints data to screen

Examples

This exports all records in the “sample” node from the CCLE project in the sandbox commons.

>>> Gen3Submission.export_node("DCF", "CCLE", "sample", "tsv", filename="DCF-CCLE_sample_node.tsv")
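
Once a node has been exported to a TSV file, the rows can be read back with the standard library. This is a minimal sketch, assuming the export is a tab-separated file with a header row. The helper name read_tsv_records and the column names in the stand-in data are illustrative, not the real "sample" node schema.

```python
import csv
import io


def read_tsv_records(handle):
    """Return each data row of a tab-separated export as a dict keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Stand-in for the contents of an exported file; the columns are assumptions.
example_tsv = (
    "type\tsubmitter_id\tsample_type\n"
    "sample\tsample-001\ttumor\n"
    "sample\tsample-002\tnormal\n"
)

records = read_tsv_records(io.StringIO(example_tsv))
print(len(records), records[0]["submitter_id"])
```

For a real export, open the file with open("DCF-CCLE_sample_node.tsv", newline="") and pass the handle to read_tsv_records.
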
export_record(program, project, uuid, fileformat, filename=None)[source]

Export a single record as json or tsv.

Parameters:
  • program (str) – The program the record is under.
  • project (str) – The project the record is under.
  • uuid (str) – The UUID of the record to export.
  • fileformat (str) – Export data as either ‘json’ or ‘tsv’
  • filename (str) – Name of the file to export to; if no filename is provided, prints data to screen

Examples

This exports a single record from the sandbox commons.

>>> Gen3Submission.export_record("DCF", "CCLE", "d70b41b9-6f90-4714-8420-e043ab8b77b9", "json", filename="DCF-CCLE_one_record.json")
get_dictionary_all()[source]

Returns the entire dictionary object for a commons.

This gets a json of the current dictionary schema for a commons.

Examples

This returns the dictionary schema for a commons.

>>> Gen3Submission.get_dictionary_all()
get_dictionary_node(node_type)[source]

Returns the dictionary schema for a specific node.

This gets the current json dictionary schema for a specific node type in a commons.

Parameters:node_type (str) – The node_type (or name of the node) to retrieve.

Examples

This returns the dictionary schema for the “subject” node.

>>> Gen3Submission.get_dictionary_node("subject")
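
A node schema is returned as json, so it can be inspected like any nested dictionary. The sketch below assumes the schema follows the JSON Schema convention of a "properties" mapping and a "required" list. That shape is an assumption, and the stand-in schema is invented for illustration.

```python
def summarize_node_schema(schema):
    """Split a node schema's property names into required and optional lists."""
    required = set(schema.get("required", []))
    properties = schema.get("properties", {})
    return {
        "required": sorted(name for name in properties if name in required),
        "optional": sorted(name for name in properties if name not in required),
    }


# Stand-in schema with an assumed shape; a real one would come from
# calling get_dictionary_node("subject") on a Gen3Submission instance.
example_schema = {
    "id": "subject",
    "required": ["submitter_id", "type"],
    "properties": {
        "type": {"type": "string"},
        "submitter_id": {"type": "string"},
        "species": {"type": "string"},
    },
}

summary = summarize_node_schema(example_schema)
print(summary)
```
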
get_graphql_schema()[source]

Returns the GraphQL schema for a commons.

This runs the GraphQL introspection query against a commons and returns the results.

Examples

This returns the GraphQL schema.

>>> Gen3Submission.get_graphql_schema()
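
A GraphQL introspection result lists the types a commons exposes. The sketch below pulls the type names out of such a result, assuming the standard response layout of data, then __schema, then types. Whether get_graphql_schema returns exactly this wrapper is an assumption, and the stand-in result is invented.

```python
def list_type_names(introspection_result):
    """Return the sorted type names from a GraphQL introspection result."""
    types = introspection_result["data"]["__schema"]["types"]
    # Names starting with "__" are reserved for the introspection system,
    # so they are skipped here.
    return sorted(t["name"] for t in types if not t["name"].startswith("__"))


# Stand-in result using the assumed layout.
example_result = {
    "data": {
        "__schema": {
            "types": [
                {"name": "project"},
                {"name": "__Schema"},
                {"name": "sample"},
            ]
        }
    }
}

type_names = list_type_names(example_result)
print(type_names)
```
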
get_programs()[source]

List registered programs

get_project_dictionary(program, project)[source]

Get dictionary schema for a given project

Parameters:
  • program – the name of the program the project is from
  • project – the name of the project you want the dictionary schema from

Example

>>> Gen3Submission.get_project_dictionary("DCF", "CCLE")
get_project_manifest(program, project)[source]

Get a project's file manifest

Parameters:
  • program – the name of the program the project is from
  • project – the name of the project you want the manifest from

Example

>>> Gen3Submission.get_project_manifest("DCF", "CCLE")
get_projects(program)[source]

List registered projects for a given program

Parameters:program – the name of the program you want the projects from

Example

This lists all the projects under the DCF program.

>>> Gen3Submission.get_projects("DCF")
open_project(program, project)[source]

Mark a project open. Opening a project means uploads, deletions, etc. are allowed.

Parameters:
  • program – the name of the program the project is from
  • project – the name of the project you want to ‘open’

Example

>>> Gen3Submission.open_project("DCF", "CCLE")
query(query_txt, variables=None, max_tries=1)[source]

Execute a GraphQL query against a data commons.

Parameters:
  • query_txt (str) – Query text.
  • variables (object, optional) – Dictionary of variables to pass with the query.
  • max_tries (int, optional) – Number of times to retry if the request fails.

Examples

This executes a query to get the list of all the project codes for all the projects in the data commons.

>>> query = "{ project(first:0) { code } }"
>>> Gen3Submission.query(query)
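
The variables parameter lets the query text stay fixed while values are supplied separately. The sketch below builds a parameterized query and the request body that GraphQL-over-HTTP clients conventionally send. The project(code: ...) argument, the name field, and the helper build_graphql_body are assumptions for illustration; the body the library actually sends may differ.

```python
import json

# A parameterized query: $code is declared in the operation signature and
# supplied separately. The "code" argument and "name" field are assumed
# to exist in the commons schema.
query_txt = """
query ProjectByCode($code: String) {
  project(code: $code) {
    code
    name
  }
}
"""
variables = {"code": "CCLE"}


def build_graphql_body(query_txt, variables=None):
    """Assemble the conventional GraphQL request body (hypothetical helper)."""
    body = {"query": query_txt}
    if variables is not None:
        body["variables"] = variables
    return body


body = build_graphql_body(query_txt, variables)
print(json.dumps(body, indent=2))
```

With a Gen3Submission instance named sub, the equivalent call would be sub.query(query_txt, variables=variables).
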
submit_file(project_id, filename, chunk_size=30, row_offset=0)[source]

Submit data in a spreadsheet file containing multiple records in rows to a Gen3 Data Commons.

Parameters:
  • project_id (str) – The project_id to submit to.
  • filename (str) – The file containing data to submit. The format can be TSV, CSV or XLSX (first worksheet only for now).
  • chunk_size (integer) – The number of rows of data to submit for each request to the API.
  • row_offset (integer) – The number of rows of data to skip; ‘0’ starts submission from the first row and submits all data.

Examples

This submits a spreadsheet file containing multiple records in rows to the CCLE project in the sandbox commons.

>>> Gen3Submission.submit_file("DCF-CCLE","data_spreadsheet.tsv")
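
The chunk_size and row_offset parameters can be pictured as skipping some rows and then batching the rest. The sketch below illustrates those semantics as described above. It is not the library's implementation, and iter_chunks is a hypothetical helper.

```python
def iter_chunks(rows, chunk_size=30, row_offset=0):
    """Yield lists of at most chunk_size rows, skipping the first row_offset rows."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    remaining = rows[row_offset:]
    for start in range(0, len(remaining), chunk_size):
        yield remaining[start:start + chunk_size]


# Ten stand-in rows, batched three at a time after skipping the first two.
rows = list(range(10))
chunks = list(iter_chunks(rows, chunk_size=3, row_offset=2))
print(chunks)  # [[2, 3, 4], [5, 6, 7], [8, 9]]
```

Under this reading, a submission that fails partway through can be resumed by passing a row_offset equal to the number of rows already submitted.
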
submit_record(program, project, json)[source]

Submit record(s) to a project as json.

Parameters:
  • program (str) – The program to submit to.
  • project (str) – The project to submit to.
  • json (object) – The json defining the record(s) to submit. For multiple records, the json should be an array of records.

Examples

This submits records to the CCLE project in the sandbox commons.

>>> Gen3Submission.submit_record("DCF", "CCLE", json)
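
The json argument can hold one record or an array of records. The sketch below builds both forms. The record fields (type, submitter_id, and a subjects link) are assumptions about a typical data dictionary, and make_sample_record is a hypothetical helper. The real required fields depend on the target commons.

```python
import json


def make_sample_record(submitter_id, subject_submitter_id):
    """Build one illustrative "sample" record linked to a subject (assumed fields)."""
    return {
        "type": "sample",
        "submitter_id": submitter_id,
        "subjects": {"submitter_id": subject_submitter_id},
    }


# A single record is one object.
single = make_sample_record("sample-001", "subject-001")

# Multiple records are an array of such objects.
many = [make_sample_record(f"sample-{i:03d}", "subject-001") for i in range(1, 4)]

print(json.dumps(many, indent=2))
```

With a Gen3Submission instance named sub, either form would be passed as the third argument, for example sub.submit_record("DCF", "CCLE", many).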