S3 バケット作成
$ aws s3 mb s3://20200505-transcribe-input-mfjt
make_bucket: 20200505-transcribe-input-mfjt
$ aws s3 mb s3://20200505-transcribe-output-mfjt
make_bucket: 20200505-transcribe-output-mfjt
Lambda Function 作成
OutpuBucketName
は自分のバケット名に変更しておく。
import json
import urllib.parse
import boto3
import datetime
s3 = boto3.client('s3')
transcribe = boto3.client('transcribe')
def lambda_handler(event, context):
bucket = event['Records'][0]['s3']['bucket']['name']
key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
try:
transcribe.start_transcription_job(
TranscriptionJobName= datetime.datetime.now().strftime("%Y%m%d%H%M%S") + '_Transcription',
LanguageCode='en-US',
Media={
'MediaFileUri': 's3://' + bucket + '/' + key
},
OutputBucketName='20200505-transcribe-output-mfjt'
)
except Exception as e:
print(e)
print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
raise e
ロールは相変わらず流用で。
$ zip transcribe-function.zip transcribe-function.py
adding: transcribe-function.py (deflated 48%)
$ aws lambda create-function \
--function-name transcribe-function \
--runtime python3.8 \
--role "arn:aws:iam::760182048326:role/translate-function-role" \
--handler transcribe-function.lambda_handler \
--zip-file fileb://transcribe-function.zip
{
"FunctionName": "transcribe-function",
"FunctionArn": "arn:aws:lambda:ap-northeast-1:760182048326:function:transcribe-function",
"Runtime": "python3.8",
"Role": "arn:aws:iam::760182048326:role/translate-function-role",
"Handler": "transcribe-function.lambda_handler",
"CodeSize": 667,
"Description": "",
"Timeout": 3,
"MemorySize": 128,
"LastModified": "2020-05-05T08:36:23.606+0000",
"CodeSha256": "Ce79ystMNCxNtMV5JNg1cBs/wsPQZNBgU/EQ9y7ftIk=",
"Version": "$LATEST",
"TracingConfig": {
"Mode": "PassThrough"
},
"RevisionId": "0406316d-82c6-4c2e-b315-d56b5e1f9ddd",
"State": "Active",
"LastUpdateStatus": "Successful"
}
イベント作成
マネジメントコンソールからの場合は設計図の使用からで大丈夫っぽいけど、CLI だと細かく設定が必要らしい。
Lambda Function への権限設定
$ aws lambda add-permission \
--function-name transcribe-function \
--statement-id s3-put-event \
--action lambda:InvokeFunction \
--principal s3.amazonaws.com \
--source-arn arn:aws:s3:::20200505-transcribe-input-mfjt
{
"Statement": "{\"Sid\":\"s3-put-event\",\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"s3.amazonaws.com\"},\"Action\":\"lambda:InvokeFunction\",\"Resource\":\"arn:aws:lambda:ap-northeast-1:760182048326:function:transcribe-function\",\"Condition\":{\"ArnLike\":{\"AWS:SourceArn\":\"arn:aws:s3:::20200505-transcribe-input-mfjt\"}}}"
}
$ aws lambda get-policy --function-name transcribe-function
{
"Policy": "{\"Version\":\"2012-10-17\",\"Id\":\"default\",\"Statement\":[{\"Sid\":\"s3-put-event\",\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"s3.amazonaws.com\"},\"Action\":\"lambda:InvokeFunction\",\"Resource\":\"arn:aws:lambda:ap-northeast-1:760182048326:function:transcribe-function\",\"Condition\":{\"ArnLike\":{\"AWS:SourceArn\":\"arn:aws:s3:::20200505-transcribe-input-mfjt\"}}}]}",
"RevisionId": "be76518e-5680-4436-8c7a-6bc36ee78573"
}
S3 へのイベント追加
$ aws s3api put-bucket-notification-configuration \
--bucket 20200505-transcribe-input-mfjt \
--notification-configuration '{"LambdaFunctionConfigurations": [{"LambdaFunctionArn": "arn:aws:lambda:ap-northeast-1:760182048326:function:transcribe-function", "Events": ["s3:ObjectCreated:Put"]}]}'
$ aws s3api get-bucket-notification-configuration --bucket 20200505-transcribe-input-mfjt
{
"LambdaFunctionConfigurations": [
{
"Id": "MTlhZThmYTItZGE3Mi00ZTM1LWE2YzktZWFkYjRjM2JmYmRk",
"LambdaFunctionArn": "arn:aws:lambda:ap-northeast-1:760182048326:function:transcribe-function",
"Events": [
"s3:ObjectCreated:Put"
]
}
]
}
ポリシーのアタッチ
$ aws iam attach-role-policy \
--role-name translate-function-role \
--policy-arn "arn:aws:iam::aws:policy/AmazonS3FullAccess"
$ aws iam attach-role-policy \
--role-name translate-function-role \
--policy-arn "arn:aws:iam::aws:policy/AmazonTranscribeFullAccess"
動作確認
$ aws s3 cp HelloEnglish-Joanna.0aa7a6dc7f1de9ac48769f366c6f447f9051db57.mp3 s3://20200505-transcribe-input-mfjt
upload: ./HelloEnglish-Joanna.0aa7a6dc7f1de9ac48769f366c6f447f9051db57.mp3 to s3://20200505-transcribe-input-mfjt/HelloEnglish-Joanna.0aa7a6dc7f1de9ac48769f366c6f447f9051db57.mp3
$ aws transcribe list-transcription-jobs
{
"TranscriptionJobSummaries": [
{
"TranscriptionJobName": "20200505090252_Transcription",
"CreationTime": "2020-05-05T18:02:53.280000+09:00",
"StartTime": "2020-05-05T18:02:53.325000+09:00",
"CompletionTime": "2020-05-05T18:04:52.861000+09:00",
"LanguageCode": "en-US",
"TranscriptionJobStatus": "COMPLETED",
"OutputLocationType": "CUSTOMER_BUCKET"
}
]
}
$ aws transcribe get-transcription-job --transcription-job-name 20200505090252_Transcription
{
"TranscriptionJob": {
"TranscriptionJobName": "20200505090252_Transcription",
"TranscriptionJobStatus": "COMPLETED",
"LanguageCode": "en-US",
"MediaSampleRateHertz": 44100,
"MediaFormat": "mp3",
"Media": {
"MediaFileUri": "s3://20200505-transcribe-input-mfjt/HelloEnglish-Joanna.0aa7a6dc7f1de9ac48769f366c6f447f9051db57.mp3"
},
"Transcript": {
"TranscriptFileUri": "https://s3.ap-northeast-1.amazonaws.com/20200505-transcribe-output-mfjt/20200505090252_Transcription.json"
},
"StartTime": "2020-05-05T18:02:53.325000+09:00",
"CreationTime": "2020-05-05T18:02:53.280000+09:00",
"CompletionTime": "2020-05-05T18:04:52.861000+09:00",
"Settings": {
"ChannelIdentification": false,
"ShowAlternatives": false
}
}
}
$ aws s3 cp s3://20200505-transcribe-output-mfjt/20200505090252_Transcription.json - | jq '.results | .transcripts'
[
{
"transcript": "Hello. Do you speak a foreign language? One language is never enough."
}
]