ハンズオン 1 編ハンズオン 2 編の続き。

S3 バケット作成

$ aws s3 mb s3://20200505-transcribe-input-mfjt
make_bucket: 20200505-transcribe-input-mfjt

$ aws s3 mb s3://20200505-transcribe-output-mfjt
make_bucket: 20200505-transcribe-output-mfjt

Lambda Function 作成

OutpuBucketName は自分のバケット名に変更しておく。

import json
import urllib.parse
import boto3
import datetime

s3 = boto3.client('s3')
transcribe = boto3.client('transcribe')

def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    try:
        transcribe.start_transcription_job(
            TranscriptionJobName= datetime.datetime.now().strftime("%Y%m%d%H%M%S") + '_Transcription',
            LanguageCode='en-US',
            Media={
                'MediaFileUri': 's3://' + bucket + '/' + key
            },
            OutputBucketName='20200505-transcribe-output-mfjt'
        )
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e

ロールは相変わらず流用で。

$ zip transcribe-function.zip transcribe-function.py
  adding: transcribe-function.py (deflated 48%)

$ aws lambda create-function \
    --function-name transcribe-function \
    --runtime python3.8 \
    --role "arn:aws:iam::760182048326:role/translate-function-role" \
    --handler transcribe-function.lambda_handler \
    --zip-file fileb://transcribe-function.zip
{
    "FunctionName": "transcribe-function",
    "FunctionArn": "arn:aws:lambda:ap-northeast-1:760182048326:function:transcribe-function",
    "Runtime": "python3.8",
    "Role": "arn:aws:iam::760182048326:role/translate-function-role",
    "Handler": "transcribe-function.lambda_handler",
    "CodeSize": 667,
    "Description": "",
    "Timeout": 3,
    "MemorySize": 128,
    "LastModified": "2020-05-05T08:36:23.606+0000",
    "CodeSha256": "Ce79ystMNCxNtMV5JNg1cBs/wsPQZNBgU/EQ9y7ftIk=",
    "Version": "$LATEST",
    "TracingConfig": {
        "Mode": "PassThrough"
    },
    "RevisionId": "0406316d-82c6-4c2e-b315-d56b5e1f9ddd",
    "State": "Active",
    "LastUpdateStatus": "Successful"
}

イベント作成

マネジメントコンソールからの場合は設計図の使用からで大丈夫っぽいけど、CLI だと細かく設定が必要らしい。

Lambda Function への権限設定

$ aws lambda add-permission \
    --function-name transcribe-function \
    --statement-id s3-put-event \
    --action lambda:InvokeFunction \
    --principal s3.amazonaws.com \
    --source-arn arn:aws:s3:::20200505-transcribe-input-mfjt
{
    "Statement": "{\"Sid\":\"s3-put-event\",\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"s3.amazonaws.com\"},\"Action\":\"lambda:InvokeFunction\",\"Resource\":\"arn:aws:lambda:ap-northeast-1:760182048326:function:transcribe-function\",\"Condition\":{\"ArnLike\":{\"AWS:SourceArn\":\"arn:aws:s3:::20200505-transcribe-input-mfjt\"}}}"
}

$ aws lambda get-policy --function-name transcribe-function
{
    "Policy": "{\"Version\":\"2012-10-17\",\"Id\":\"default\",\"Statement\":[{\"Sid\":\"s3-put-event\",\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"s3.amazonaws.com\"},\"Action\":\"lambda:InvokeFunction\",\"Resource\":\"arn:aws:lambda:ap-northeast-1:760182048326:function:transcribe-function\",\"Condition\":{\"ArnLike\":{\"AWS:SourceArn\":\"arn:aws:s3:::20200505-transcribe-input-mfjt\"}}}]}",
    "RevisionId": "be76518e-5680-4436-8c7a-6bc36ee78573"
}

S3 へのイベント追加

$ aws s3api put-bucket-notification-configuration \
    --bucket 20200505-transcribe-input-mfjt \
    --notification-configuration '{"LambdaFunctionConfigurations": [{"LambdaFunctionArn": "arn:aws:lambda:ap-northeast-1:760182048326:function:transcribe-function", "Events": ["s3:ObjectCreated:Put"]}]}'

$ aws s3api get-bucket-notification-configuration --bucket 20200505-transcribe-input-mfjt
{
    "LambdaFunctionConfigurations": [
        {
            "Id": "MTlhZThmYTItZGE3Mi00ZTM1LWE2YzktZWFkYjRjM2JmYmRk",
            "LambdaFunctionArn": "arn:aws:lambda:ap-northeast-1:760182048326:function:transcribe-function",
            "Events": [
                "s3:ObjectCreated:Put"
            ]
        }
    ]
}

ポリシーのアタッチ

$ aws iam attach-role-policy \
    --role-name translate-function-role \
    --policy-arn "arn:aws:iam::aws:policy/AmazonS3FullAccess"

$ aws iam attach-role-policy \
    --role-name translate-function-role \
    --policy-arn "arn:aws:iam::aws:policy/AmazonTranscribeFullAccess"

動作確認

$ aws s3 cp HelloEnglish-Joanna.0aa7a6dc7f1de9ac48769f366c6f447f9051db57.mp3 s3://20200505-transcribe-input-mfjt
upload: ./HelloEnglish-Joanna.0aa7a6dc7f1de9ac48769f366c6f447f9051db57.mp3 to s3://20200505-transcribe-input-mfjt/HelloEnglish-Joanna.0aa7a6dc7f1de9ac48769f366c6f447f9051db57.mp3

$ aws transcribe list-transcription-jobs
{
    "TranscriptionJobSummaries": [
        {
            "TranscriptionJobName": "20200505090252_Transcription",
            "CreationTime": "2020-05-05T18:02:53.280000+09:00",
            "StartTime": "2020-05-05T18:02:53.325000+09:00",
            "CompletionTime": "2020-05-05T18:04:52.861000+09:00",
            "LanguageCode": "en-US",
            "TranscriptionJobStatus": "COMPLETED",
            "OutputLocationType": "CUSTOMER_BUCKET"
        }
    ]
}

$ aws transcribe get-transcription-job --transcription-job-name 20200505090252_Transcription
{
    "TranscriptionJob": {
        "TranscriptionJobName": "20200505090252_Transcription",
        "TranscriptionJobStatus": "COMPLETED",
        "LanguageCode": "en-US",
        "MediaSampleRateHertz": 44100,
        "MediaFormat": "mp3",
        "Media": {
            "MediaFileUri": "s3://20200505-transcribe-input-mfjt/HelloEnglish-Joanna.0aa7a6dc7f1de9ac48769f366c6f447f9051db57.mp3"
        },
        "Transcript": {
            "TranscriptFileUri": "https://s3.ap-northeast-1.amazonaws.com/20200505-transcribe-output-mfjt/20200505090252_Transcription.json"
        },
        "StartTime": "2020-05-05T18:02:53.325000+09:00",
        "CreationTime": "2020-05-05T18:02:53.280000+09:00",
        "CompletionTime": "2020-05-05T18:04:52.861000+09:00",
        "Settings": {
            "ChannelIdentification": false,
            "ShowAlternatives": false
        }
    }
}

$ aws s3 cp s3://20200505-transcribe-output-mfjt/20200505090252_Transcription.json - | jq '.results | .transcripts'
[
  {
    "transcript": "Hello. Do you speak a foreign language? One language is never enough."
  }
]

参考