jobs
Creates, updates, deletes, gets or lists a jobs resource.
Overview
Property | Value |
---|---|
Name | jobs |
Type | Resource |
Id | google.dataflow.jobs |
Fields
Name | Datatype | Description |
---|---|---|
id | string | The unique ID of this job. This field is set by the Dataflow service when the job is created, and is immutable for the life of the job. |
name | string | Optional. The user-specified Dataflow job name. Only one active job with a given name can exist in a project within one region at any given time. Jobs in different regions can have the same name. If a caller attempts to create a job with the same name as an active job that already exists, the attempt returns the existing job. The name must match the regular expression [a-z]([-a-z0-9]{0,1022}[a-z0-9])? |
clientRequestId | string | The client's unique identifier of the job, re-used across retried attempts. If this field is set, the service will ensure its uniqueness. The request to create a job will fail if the service has knowledge of a previously submitted job with the same client's ID and job name. The caller may use this field to ensure idempotence of job creation across retried attempts to create a job. By default, the field is empty and, in that case, the service ignores it. |
createTime | string | The timestamp when the job was initially created. Immutable and set by the Cloud Dataflow service. |
createdFromSnapshotId | string | If this is specified, the job's initial state is populated from the given snapshot. |
currentState | string | The current state of the job. Jobs are created in the JOB_STATE_STOPPED state unless otherwise specified. A job in the JOB_STATE_RUNNING state may asynchronously enter a terminal state. After a job has reached a terminal state, no further state updates may be made. This field might be mutated by the Dataflow service; callers cannot mutate it. |
currentStateTime | string | The timestamp associated with the current state. |
environment | object | Describes the environment in which a Dataflow Job runs. |
executionInfo | object | Additional information about how a Cloud Dataflow job will be executed that isn't contained in the submitted job. |
jobMetadata | object | Metadata available primarily for filtering jobs. Will be included in the ListJob response and Job SUMMARY view. |
labels | object | User-defined labels for this job. The labels map can contain no more than 64 entries. Entries of the labels map are UTF8 strings that comply with the following restrictions: keys must conform to the regexp \p{Ll}\p{Lo}{0,62}; values must conform to the regexp [\p{Ll}\p{Lo}\p{N}_-]{0,63}; both keys and values are additionally constrained to be <= 128 bytes in size. |
location | string | Optional. The [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that contains this job. |
pipelineDescription | object | A descriptive representation of submitted pipeline as well as the executed form. This data is provided by the Dataflow service for ease of visualizing the pipeline and interpreting Dataflow provided metrics. |
projectId | string | The ID of the Google Cloud project that the job belongs to. |
replaceJobId | string | If this job is an update of an existing job, this field is the job ID of the job it replaced. When sending a CreateJobRequest, you can update a job by specifying it here. The job named here is stopped, and its intermediate state is transferred to this job. |
replacedByJobId | string | If another job is an update of this job (and thus, this job is in JOB_STATE_UPDATED), this field contains the ID of that job. |
requestedState | string | The job's requested state. Applies to UpdateJob requests. Set requested_state with UpdateJob requests to switch between the states JOB_STATE_STOPPED and JOB_STATE_RUNNING. You can also use UpdateJob requests to change a job's state from JOB_STATE_RUNNING to JOB_STATE_CANCELLED, JOB_STATE_DONE, or JOB_STATE_DRAINED. These states irrevocably terminate the job if it hasn't already reached a terminal state. This field has no effect on CreateJob requests. |
runtimeUpdatableParams | object | Additional job parameters that can only be updated during runtime using the projects.jobs.update method. These fields have no effect when specified during job creation. |
satisfiesPzi | boolean | Output only. Reserved for future use. This field is set only in responses from the server; it is ignored if it is set in any requests. |
satisfiesPzs | boolean | Reserved for future use. This field is set only in responses from the server; it is ignored if it is set in any requests. |
serviceResources | object | Resources used by the Dataflow Service to run the job. |
stageStates | array | This field may be mutated by the Cloud Dataflow service; callers cannot mutate it. |
startTime | string | The timestamp when the job was started (transitioned to JOB_STATE_PENDING). Flexible resource scheduling jobs are started with some delay after job creation, so start_time is unset before start and is updated when the job is started by the Cloud Dataflow service. For other jobs, start_time always equals create_time and is immutable and set by the Cloud Dataflow service. |
steps | array | Exactly one of step or steps_location should be specified. The top-level steps that constitute the entire job. Only retrieved with JOB_VIEW_ALL. |
stepsLocation | string | The Cloud Storage location where the steps are stored. |
tempFiles | array | A set of files the system should be aware of that are used for temporary storage. These temporary files will be removed on job completion. No duplicates are allowed. No file patterns are supported. The supported files are Google Cloud Storage paths: storage.googleapis.com/{bucket}/{object} or bucket.storage.googleapis.com/{object} |
transformNameMapping | object | Optional. The map of transform name prefixes of the job to be replaced to the corresponding name prefixes of the new job. |
type | string | Optional. The type of Dataflow job. |
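Several of these fields interact at create time; in particular, clientRequestId makes job creation idempotent across retried attempts, as described above. A minimal sketch of supplying it (assuming the identifier is generated by the caller and re-used on every retry):
/*+ create */
INSERT INTO google.dataflow.jobs (
projectId,
name,
clientRequestId
)
SELECT
'{{ projectId }}',
'{{ name }}',
'{{ clientRequestId }}' -- re-use the same value on retries; the service rejects a duplicate create with the same client ID and job name
;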
Methods
Name | Accessible by | Required Params | Description |
---|---|---|---|
projects_jobs_get | SELECT | jobId, projectId | Gets the state of the specified Cloud Dataflow job. To get the state of a job, we recommend using projects.locations.jobs.get with a [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using projects.jobs.get is not recommended, as you can only get the state of jobs that are running in us-central1. |
projects_jobs_list | SELECT | projectId | List the jobs of a project. To list the jobs of a project in a region, we recommend using projects.locations.jobs.list with a [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). To list all jobs across all regions, use projects.jobs.aggregated. Using projects.jobs.list is not recommended, because you can only get the list of jobs that are running in us-central1. projects.locations.jobs.list and projects.jobs.list support filtering the list of jobs by name; filtering by name isn't supported by projects.jobs.aggregated. |
projects_locations_jobs_get | SELECT | jobId, location, projectId | Gets the state of the specified Cloud Dataflow job. To get the state of a job, we recommend using projects.locations.jobs.get with a [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using projects.jobs.get is not recommended, as you can only get the state of jobs that are running in us-central1. |
projects_locations_jobs_list | SELECT | location, projectId | List the jobs of a project. To list the jobs of a project in a region, we recommend using projects.locations.jobs.list with a [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). To list all jobs across all regions, use projects.jobs.aggregated. Using projects.jobs.list is not recommended, because you can only get the list of jobs that are running in us-central1. projects.locations.jobs.list and projects.jobs.list support filtering the list of jobs by name; filtering by name isn't supported by projects.jobs.aggregated. |
projects_jobs_create | INSERT | projectId | Creates a Cloud Dataflow job. To create a job, we recommend using projects.locations.jobs.create with a [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using projects.jobs.create is not recommended, as your job will always start in us-central1. Do not enter confidential information when you supply string values using the API. |
projects_locations_jobs_create | INSERT | location, projectId | Creates a Cloud Dataflow job. To create a job, we recommend using projects.locations.jobs.create with a [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using projects.jobs.create is not recommended, as your job will always start in us-central1. Do not enter confidential information when you supply string values using the API. |
projects_jobs_update | REPLACE | jobId, projectId | Updates the state of an existing Cloud Dataflow job. To update the state of an existing job, we recommend using projects.locations.jobs.update with a [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using projects.jobs.update is not recommended, as you can only update the state of jobs that are running in us-central1. |
projects_locations_jobs_update | REPLACE | jobId, location, projectId | Updates the state of an existing Cloud Dataflow job. To update the state of an existing job, we recommend using projects.locations.jobs.update with a [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). Using projects.jobs.update is not recommended, as you can only update the state of jobs that are running in us-central1. |
projects_jobs_aggregated | EXEC | projectId | List the jobs of a project across all regions. Note: This method doesn't support filtering the list of jobs by name. |
projects_jobs_snapshot | EXEC | jobId, projectId | Snapshot the state of a streaming job. |
projects_locations_jobs_snapshot | EXEC | jobId, location, projectId | Snapshot the state of a streaming job. |
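The snapshot methods are EXEC operations rather than queries. A sketch of snapshotting a streaming job through the regional method, assuming the standard StackQL EXEC parameter syntax (@param = 'value'); snapshot options beyond the required parameters, such as a TTL, would be supplied through the request body:
EXEC google.dataflow.jobs.projects_locations_jobs_snapshot
@jobId = '{{ jobId }}',
@location = '{{ location }}',
@projectId = '{{ projectId }}';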
SELECT examples
List the jobs of a project. To list the jobs of a project in a region, we recommend using projects.locations.jobs.list with a [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints). To list all jobs across all regions, use projects.jobs.aggregated. Using projects.jobs.list is not recommended, because you can only get the list of jobs that are running in us-central1. projects.locations.jobs.list and projects.jobs.list support filtering the list of jobs by name; filtering by name isn't supported by projects.jobs.aggregated.
SELECT
id,
name,
clientRequestId,
createTime,
createdFromSnapshotId,
currentState,
currentStateTime,
environment,
executionInfo,
jobMetadata,
labels,
location,
pipelineDescription,
projectId,
replaceJobId,
replacedByJobId,
requestedState,
runtimeUpdatableParams,
satisfiesPzi,
satisfiesPzs,
serviceResources,
stageStates,
startTime,
steps,
stepsLocation,
tempFiles,
transformNameMapping,
type
FROM google.dataflow.jobs
WHERE projectId = '{{ projectId }}';
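Because projects.jobs.list only covers us-central1, regional queries are generally preferable. The sketches below assume that adding location to the WHERE clause routes the query to projects_locations_jobs_list, and that additionally constraining jobId routes it to projects_locations_jobs_get (both methods and their required params are listed in the table above):
SELECT
id,
name,
currentState,
currentStateTime
FROM google.dataflow.jobs
WHERE projectId = '{{ projectId }}'
AND location = '{{ location }}';

-- Get the state of a single job
SELECT
id,
name,
currentState
FROM google.dataflow.jobs
WHERE projectId = '{{ projectId }}'
AND location = '{{ location }}'
AND jobId = '{{ jobId }}';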
INSERT example
Use the following StackQL query and manifest file to create a new jobs resource.
- All Properties
- Manifest
/*+ create */
INSERT INTO google.dataflow.jobs (
projectId,
name,
type,
environment,
steps,
stepsLocation,
currentState,
currentStateTime,
requestedState,
executionInfo,
replaceJobId,
transformNameMapping,
clientRequestId,
replacedByJobId,
tempFiles,
labels,
location,
pipelineDescription,
stageStates,
jobMetadata,
startTime,
createdFromSnapshotId,
satisfiesPzs,
runtimeUpdatableParams
)
SELECT
'{{ projectId }}',
'{{ name }}',
'{{ type }}',
'{{ environment }}',
'{{ steps }}',
'{{ stepsLocation }}',
'{{ currentState }}',
'{{ currentStateTime }}',
'{{ requestedState }}',
'{{ executionInfo }}',
'{{ replaceJobId }}',
'{{ transformNameMapping }}',
'{{ clientRequestId }}',
'{{ replacedByJobId }}',
'{{ tempFiles }}',
'{{ labels }}',
'{{ location }}',
'{{ pipelineDescription }}',
'{{ stageStates }}',
'{{ jobMetadata }}',
'{{ startTime }}',
'{{ createdFromSnapshotId }}',
{{ satisfiesPzs }},
'{{ runtimeUpdatableParams }}'
;
- name: your_resource_model_name
props:
- name: id
value: string
- name: projectId
value: string
- name: name
value: string
- name: type
value: string
- name: environment
value:
- name: tempStoragePrefix
value: string
- name: clusterManagerApiService
value: string
- name: experiments
value:
- string
- name: serviceOptions
value:
- string
- name: serviceKmsKeyName
value: string
- name: workerPools
value:
- - name: kind
value: string
- name: numWorkers
value: integer
- name: packages
value:
- - name: name
value: string
- name: location
value: string
- name: defaultPackageSet
value: string
- name: machineType
value: string
- name: teardownPolicy
value: string
- name: diskSizeGb
value: integer
- name: diskType
value: string
- name: diskSourceImage
value: string
- name: zone
value: string
- name: taskrunnerSettings
value:
- name: taskUser
value: string
- name: taskGroup
value: string
- name: oauthScopes
value:
- string
- name: baseUrl
value: string
- name: dataflowApiVersion
value: string
- name: parallelWorkerSettings
value:
- name: baseUrl
value: string
- name: reportingEnabled
value: boolean
- name: servicePath
value: string
- name: shuffleServicePath
value: string
- name: workerId
value: string
- name: tempStoragePrefix
value: string
- name: baseTaskDir
value: string
- name: continueOnException
value: boolean
- name: logToSerialconsole
value: boolean
- name: alsologtostderr
value: boolean
- name: logUploadLocation
value: string
- name: logDir
value: string
- name: tempStoragePrefix
value: string
- name: harnessCommand
value: string
- name: workflowFileName
value: string
- name: commandlinesFileName
value: string
- name: vmId
value: string
- name: languageHint
value: string
- name: streamingWorkerMainClass
value: string
- name: onHostMaintenance
value: string
- name: dataDisks
value:
- - name: sizeGb
value: integer
- name: diskType
value: string
- name: mountPoint
value: string
- name: metadata
value: object
- name: autoscalingSettings
value:
- name: algorithm
value: string
- name: maxNumWorkers
value: integer
- name: poolArgs
value: object
- name: network
value: string
- name: subnetwork
value: string
- name: workerHarnessContainerImage
value: string
- name: numThreadsPerWorker
value: integer
- name: ipConfiguration
value: string
- name: sdkHarnessContainerImages
value:
- - name: containerImage
value: string
- name: useSingleCorePerContainer
value: boolean
- name: environmentId
value: string
- name: capabilities
value:
- string
- name: userAgent
value: object
- name: version
value: object
- name: dataset
value: string
- name: sdkPipelineOptions
value: object
- name: internalExperiments
value: object
- name: serviceAccountEmail
value: string
- name: flexResourceSchedulingGoal
value: string
- name: workerRegion
value: string
- name: workerZone
value: string
- name: shuffleMode
value: string
- name: debugOptions
value:
- name: enableHotKeyLogging
value: boolean
- name: dataSampling
value:
- name: behaviors
value:
- string
- name: useStreamingEngineResourceBasedBilling
value: boolean
- name: streamingMode
value: string
- name: steps
value:
- - name: kind
value: string
- name: name
value: string
- name: properties
value: object
- name: stepsLocation
value: string
- name: currentState
value: string
- name: currentStateTime
value: string
- name: requestedState
value: string
- name: executionInfo
value:
- name: stages
value: object
- name: createTime
value: string
- name: replaceJobId
value: string
- name: transformNameMapping
value: object
- name: clientRequestId
value: string
- name: replacedByJobId
value: string
- name: tempFiles
value:
- string
- name: labels
value: object
- name: location
value: string
- name: pipelineDescription
value:
- name: originalPipelineTransform
value:
- - name: kind
value: string
- name: id
value: string
- name: name
value: string
- name: displayData
value:
- - name: key
value: string
- name: namespace
value: string
- name: strValue
value: string
- name: int64Value
value: string
- name: floatValue
value: number
- name: javaClassValue
value: string
- name: timestampValue
value: string
- name: durationValue
value: string
- name: boolValue
value: boolean
- name: shortStrValue
value: string
- name: url
value: string
- name: label
value: string
- name: outputCollectionName
value:
- string
- name: inputCollectionName
value:
- string
- name: executionPipelineStage
value:
- - name: name
value: string
- name: id
value: string
- name: kind
value: string
- name: inputSource
value:
- - name: userName
value: string
- name: name
value: string
- name: originalTransformOrCollection
value: string
- name: sizeBytes
value: string
- name: outputSource
value:
- - name: userName
value: string
- name: name
value: string
- name: originalTransformOrCollection
value: string
- name: sizeBytes
value: string
- name: prerequisiteStage
value:
- string
- name: componentTransform
value:
- - name: userName
value: string
- name: name
value: string
- name: originalTransform
value: string
- name: componentSource
value:
- - name: userName
value: string
- name: name
value: string
- name: originalTransformOrCollection
value: string
- name: displayData
value:
- - name: key
value: string
- name: namespace
value: string
- name: strValue
value: string
- name: int64Value
value: string
- name: floatValue
value: number
- name: javaClassValue
value: string
- name: timestampValue
value: string
- name: durationValue
value: string
- name: boolValue
value: boolean
- name: shortStrValue
value: string
- name: url
value: string
- name: label
value: string
- name: stepNamesHash
value: string
- name: stageStates
value:
- - name: executionStageName
value: string
- name: executionStageState
value: string
- name: currentStateTime
value: string
- name: jobMetadata
value:
- name: sdkVersion
value:
- name: version
value: string
- name: versionDisplayName
value: string
- name: sdkSupportStatus
value: string
- name: bugs
value:
- - name: type
value: string
- name: severity
value: string
- name: uri
value: string
- name: spannerDetails
value:
- - name: projectId
value: string
- name: instanceId
value: string
- name: databaseId
value: string
- name: bigqueryDetails
value:
- - name: table
value: string
- name: dataset
value: string
- name: projectId
value: string
- name: query
value: string
- name: bigTableDetails
value:
- - name: projectId
value: string
- name: instanceId
value: string
- name: tableId
value: string
- name: pubsubDetails
value:
- - name: topic
value: string
- name: subscription
value: string
- name: fileDetails
value:
- - name: filePattern
value: string
- name: datastoreDetails
value:
- - name: namespace
value: string
- name: projectId
value: string
- name: userDisplayProperties
value: object
- name: startTime
value: string
- name: createdFromSnapshotId
value: string
- name: satisfiesPzs
value: boolean
- name: runtimeUpdatableParams
value:
- name: maxNumWorkers
value: integer
- name: minNumWorkers
value: integer
- name: workerUtilizationHint
value: number
- name: satisfiesPzi
value: boolean
- name: serviceResources
value:
- name: zones
value:
- string
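The all-properties query above includes several fields that the Fields table marks as service-set (for example currentState, startTime, executionInfo); a working create usually supplies far fewer values. A minimal sketch, assuming the remaining fields can be left to service defaults; per the API, type would be JOB_TYPE_BATCH or JOB_TYPE_STREAMING:
/*+ create */
INSERT INTO google.dataflow.jobs (
projectId,
location,
name,
type,
environment,
steps
)
SELECT
'{{ projectId }}',
'{{ location }}',
'{{ name }}',
'{{ type }}',
'{{ environment }}',
'{{ steps }}'
;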
REPLACE example
Replaces all fields in the specified jobs resource.
/*+ update */
REPLACE google.dataflow.jobs
SET
projectId = '{{ projectId }}',
name = '{{ name }}',
type = '{{ type }}',
environment = '{{ environment }}',
steps = '{{ steps }}',
stepsLocation = '{{ stepsLocation }}',
currentState = '{{ currentState }}',
currentStateTime = '{{ currentStateTime }}',
requestedState = '{{ requestedState }}',
executionInfo = '{{ executionInfo }}',
replaceJobId = '{{ replaceJobId }}',
transformNameMapping = '{{ transformNameMapping }}',
clientRequestId = '{{ clientRequestId }}',
replacedByJobId = '{{ replacedByJobId }}',
tempFiles = '{{ tempFiles }}',
labels = '{{ labels }}',
location = '{{ location }}',
pipelineDescription = '{{ pipelineDescription }}',
stageStates = '{{ stageStates }}',
jobMetadata = '{{ jobMetadata }}',
startTime = '{{ startTime }}',
createdFromSnapshotId = '{{ createdFromSnapshotId }}',
satisfiesPzs = true|false,
runtimeUpdatableParams = '{{ runtimeUpdatableParams }}'
WHERE
jobId = '{{ jobId }}'
AND projectId = '{{ projectId }}';
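As the requestedState field notes, the update methods are also how a job is terminated. A sketch of cancelling a running job, assuming that supplying location routes the statement to projects_locations_jobs_update and that fields other than requestedState can be omitted:
/*+ update */
REPLACE google.dataflow.jobs
SET
requestedState = 'JOB_STATE_CANCELLED' -- or 'JOB_STATE_DRAINED' to drain a streaming job
WHERE
jobId = '{{ jobId }}'
AND projectId = '{{ projectId }}'
AND location = '{{ location }}';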