Building an AI-powered generator for captioned short videos

Introduction

The AI Shorts Video Generator is a tool that automates the creation of short videos using AI services. It turns a text prompt into a complete video, including the script, the narration, and the video assembly.

This article walks through how the application was built, step by step.

Prerequisites

To set up and replicate the AI Shorts Video Generator, make sure you meet the following requirements:

System requirements

Node.js: version 22.x or later.
.NET SDK: version 9 or later.
PostgreSQL: for database management (for example, Neon Serverless Postgres).
Accounts with:
- Cloudinary for media storage and processing.
- Cloudflare Workers AI for image generation (for example, flux-1-schnell).
- Google Cloud Text-to-Speech for audio synthesis.
- AssemblyAI for caption generation.
- Gemini API for generating the video content.

Dependencies

The frontend and backend use the following tools and libraries:

Frontend:
- Next.js, TailwindCSS, Remotion, Axios, and shadcn components.
Backend:
- Google.Cloud.TextToSpeech.V1, AssemblyAI SDK, CloudinaryDotNet SDK, and Npgsql for PostgreSQL.

Environment variables

Configure the following environment variables so the application can reach the APIs and services it integrates with:

Frontend (.env):

NEXT_PUBLIC_API_URL=<your API URL>
REMOTION_AWS_SERVE_URL=<your AWS serve URL>
REMOTION_AWS_BUCKET_NAME=<your AWS bucket name>

Backend (appsettings.Development.json):

{
  "GoogleApi": {
    "GeminiKey": "your-gemini-api-key",
    "TextToSpeechKey": "your-text-to-speech-api-key"
  },
  "AssemblyAi": {
    "ApiKey": "your-assemblyai-api-key"
  },
  "CloudinaryUrl": "your-cloudinary-url",
  "Cloudflare": {
    "ApiKey": "your-cloudflare-api-key",
    "AccountId": "your-cloudflare-account-id"
  },
  "ConnectionStrings": {
    "DefaultConnection": "your-postgresql-connection-string"
  }
}

Recommended knowledge

Familiarity with React and Next.js for frontend development.
Basic knowledge of .NET Core for backend development.
Experience working with APIs.

Architecture overview

The AI Shorts Video Generator simplifies the creation of short videos by combining several AI services. This section describes the flow and the main components of the system.

Flow Chart

User input and video content generation
- Users start by filling out a form to create a short video. The form captures the video's topic, style, and desired duration.
- Using Google Gemini, the application generates the script and the associated image prompts for the video.
Audio generation
- The generated script is converted to audio with the Google Text-to-Speech API.
- The resulting MP3 file is stored in Cloudinary so it is easy to access and include in the video.
Caption generation
- The MP3 audio file is processed with AssemblyAI to generate captions.
- The captions are stored in a PostgreSQL database for later retrieval and synchronization with the video.
Image generation
- Based on the image prompts generated by Google Gemini, the application creates matching visuals with a text-to-image model.
- These images are also stored in Cloudinary.
Video compilation
- The final video is compiled with Remotion, combining the script, audio, generated images, and captions.
- The finished video can then be viewed, exported, or deleted as the user needs.
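
The flow above can be sketched as a single client-side function. This is only an illustration under stated assumptions: the endpoint paths and request field names match the backend built later in this article, but the `Post` helper, the `Scene` and `VideoAssets` types, the camelCase response fields, and the choice to join the scene texts into one narration script are hypothetical and not taken from the original project.

```typescript
// Hypothetical helper type: posts a JSON body to a backend path and resolves with the parsed response.
type Post = (path: string, body: unknown) => Promise<unknown>;

// Assumed shape of one item returned by /generate-content.
interface Scene {
  imagePrompt: string;
  contextText: string;
}

interface VideoAssets {
  scenes: Scene[];
  audioUrl: string;
  captions: unknown;
  imageUrls: string[];
}

async function generateVideoAssets(post: Post, prompt: string): Promise<VideoAssets> {
  // 1. Script and image prompts generated by Gemini.
  const scenes = (await post("/generate-content", { input: prompt })) as Scene[];

  // 2. Narration: join the scene texts into one script (an assumption) and synthesize it.
  const script = scenes.map((s) => s.contextText).join(" ");
  const audioUrl = (await post("/generate-audio", { input: script })) as string;

  // 3. Captions depend on the audio URL; images depend only on the prompts,
  //    so both steps can run concurrently.
  const [captions, imageUrls] = await Promise.all([
    post("/generate-captions", { fileUrl: audioUrl }),
    Promise.all(
      scenes.map((s) => post("/generate-image", { prompt: s.imagePrompt }) as Promise<string>)
    ),
  ]);

  return { scenes, audioUrl, captions, imageUrls };
}
```

Passing the `post` function in as a parameter keeps the sketch independent of any HTTP library; in the real frontend it could be a thin wrapper around Axios.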

Key technologies

Frontend: built with Next.js, providing the user interface for creating and managing videos.
Backend: built with ASP.NET Core, handling API requests and integrating with the AI services.
Database: PostgreSQL stores metadata, captions, and other essential data.
Cloud services:
- Cloudinary: manages storage and retrieval of audio and image media.
- Google Cloud: provides AI capabilities such as Gemini for content generation and Text-to-Speech for audio synthesis.
- Cloudflare Workers AI: creates images from text prompts.
- AssemblyAI: generates captions to improve accessibility.

Step-by-step backend implementation

This section walks you through setting up and implementing the AI Shorts Video Generator from scratch. We'll cover the frontend and backend components, the AI integrations, and video rendering.

1. Create the backend project

dotnet new webapi --name backend

2. Create the frontend project

npx create-next-app@latest

Make sure to select Tailwind CSS; answer the other prompts according to your own preferences.

3. Generating video content with Google Gemini

The /generate-content API endpoint receives the user's input and generates a script along with image prompts.

3.1 Get a Gemini API key by following its docs.

3.2 Configure your API key by adding it to the appsettings.Development.json file:

{
  "GoogleApi": {
    "GeminiKey": "your-gemini-api-key"
  }
}

3.3 Create a service class for the Gemini API.

using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using AiShortsGenerator.Models;

public class GeminiApiService(HttpClient httpClient, IConfiguration configuration)
{
    public async Task<List<VideoContentItem>> CallGoogleApi(string input)
    {
        if (string.IsNullOrWhiteSpace(input))
        {
            throw new ArgumentException("User input cannot be null or empty", nameof(input));
        }

        var apiKey = configuration["GoogleApi:GeminiKey"];

        if (string.IsNullOrEmpty(apiKey))
        {
            throw new InvalidOperationException("API key is not configured");
        }

        string url = $"https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent?key={apiKey}";

        // Serialize the body instead of interpolating the input into a JSON string,
        // so quotes and line breaks in the user input are escaped correctly.
        string requestBody = JsonSerializer.Serialize(new
        {
            contents = new[]
            {
                new { role = "user", parts = new[] { new { text = input } } }
            },
            generationConfig = new
            {
                temperature = 1,
                topK = 40,
                topP = 0.95,
                maxOutputTokens = 8192,
                responseMimeType = "application/json"
            }
        });

        var content = new StringContent(requestBody, Encoding.UTF8, "application/json");
        var response = await httpClient.PostAsync(url, content);

        if (response.IsSuccessStatusCode)
        {
            var responseString = await response.Content.ReadAsStringAsync();
            if (!string.IsNullOrEmpty(responseString))
            {
                var dataJson = JsonSerializer.Deserialize<JsonElement>(responseString);

                if (dataJson.TryGetProperty("candidates", out var candidates))
                {
                    // The generated JSON arrives as text inside the first candidate.
                    var contentText = candidates[0]
                        .GetProperty("content")
                        .GetProperty("parts")[0]
                        .GetProperty("text")
                        .GetString();

                    if (!string.IsNullOrEmpty(contentText))
                    {
                        var videoContentList = JsonSerializer.Deserialize<List<VideoContentItem>>(contentText);

                        if (videoContentList != null)
                        {
                            return videoContentList;
                        }
                    }
                }
            }
        }

        // Reached on a non-success status or when the response could not be parsed.
        var errorContent = await response.Content.ReadAsStringAsync();
        throw new Exception($"Error calling Google API: {response.StatusCode}, Content: {errorContent}");
    }
}

Add to Program.cs:

builder.Services.AddHttpClient<GeminiApiService>();

3.4 Make sure the API key is set in the appsettings.Development.json file, as in step 3.2:

{
  "GoogleApi": {
    "GeminiKey": "your-gemini-api-key"
  }
}

3.5 Create the /generate-content endpoint:

app.MapPost("/generate-content", async (GeminiApiService googleApiService, [FromBody] JsonElement body) =>
    {
        try
        {
            if (!body.TryGetProperty("input", out var userInputJson) || string.IsNullOrWhiteSpace(userInputJson.GetString()))
            {
                return Results.BadRequest(new { success = false, message = "Required parameter 'input' is missing or invalid." });
            }

            var apiKey = builder.Configuration["GoogleApi:GeminiKey"];
            if (string.IsNullOrEmpty(apiKey))
            {
                return Results.BadRequest(new { success = false, message = "API key is missing or not configured." });
            }

            var userInput = userInputJson.GetString()!;
            var result = await googleApiService.CallGoogleApi(userInput);

            return Results.Ok(result);
        }
        catch (Exception ex)
        {
            return Results.BadRequest(new { success = false, message = ex.Message });
        }
    })
    .WithName("GenerateContent")
    .Produces<List<VideoContentItem>>()
    .Produces(400);
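
Because the script comes from a language model, it can help to check the shape of the /generate-content response before passing it further down the pipeline. The sketch below is a hypothetical client-side validator, not part of the original project. It assumes the response is a JSON array whose items carry camelCase `imagePrompt` and `contextText` fields, mirroring the backend's `VideoContentItem` class.

```typescript
// Assumed client-side mirror of the backend's VideoContentItem.
interface VideoContentItem {
  imagePrompt: string;
  contextText: string;
}

// Parse a JSON string and narrow it to VideoContentItem[], throwing on any mismatch.
function parseVideoContent(json: string): VideoContentItem[] {
  const data: unknown = JSON.parse(json);
  if (!Array.isArray(data)) {
    throw new Error("Expected a JSON array of scenes");
  }
  return data.map((item, index) => {
    // Treat null or primitive items as empty records so the checks below reject them.
    const record = (item ?? {}) as Record<string, unknown>;
    const imagePrompt = record.imagePrompt;
    const contextText = record.contextText;
    if (typeof imagePrompt !== "string" || typeof contextText !== "string") {
      throw new Error(`Scene ${index} is missing imagePrompt or contextText`);
    }
    return { imagePrompt, contextText };
  });
}
```

Failing early here gives the user a clear error instead of a broken video several steps later.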

4. Text-to-speech conversion

The /generate-audio endpoint converts the script into an MP3 file.

4.1 Create a project in Google Cloud and enable the Google Cloud Text-to-Speech API.

4.2 Create an API key: search for Google Cloud Text-to-Speech > Manage > Credentials > CREATE CREDENTIALS > API key

4.3 Add the Google.Cloud.TextToSpeech.V1 and CloudinaryDotNet NuGet packages to the project:

dotnet add package Google.Cloud.TextToSpeech.V1
dotnet add package CloudinaryDotNet

4.4 Update appsettings.Development.json:

{
  "GoogleApi": {
    "GeminiKey": "your-gemini-api-key",
    "TextToSpeechKey": "your-text-to-speech-api-key"
  },
  "CloudinaryUrl": "your-cloudinary-url"
}

4.5 Create a service class for the Google Cloud Text-to-Speech API:

using Google.Cloud.TextToSpeech.V1;

namespace AiShortsGenerator.Services;

public class TextToSpeechService(IConfiguration configuration)
{
    public async Task<byte[]> SynthesizeTextToSpeech(string inputText)
    {
        if (string.IsNullOrWhiteSpace(inputText))
        {
            throw new ArgumentException("Input text cannot be null or empty", nameof(inputText));
        }

        var client = await new TextToSpeechClientBuilder
        {
            ApiKey = configuration["GoogleApi:TextToSpeechKey"]
        }.BuildAsync();

        var input = new SynthesisInput
        {
            Text = inputText
        };

        var voiceSelection = new VoiceSelectionParams
        {
            LanguageCode = "en-US",
            SsmlGender = SsmlVoiceGender.Neutral
        };

        var audioConfig = new AudioConfig
        {
            AudioEncoding = AudioEncoding.Mp3
        };

        var response = await client.SynthesizeSpeechAsync(input, voiceSelection, audioConfig);
        return response.AudioContent.ToArray();
    }
}

Add to Program.cs:

builder.Services.AddScoped<TextToSpeechService>();

4.6 Create a service class for the Cloudinary API:

The MP3 audio file must be reachable over the internet so that AssemblyAI can fetch it to generate the captions.

using CloudinaryDotNet;
using CloudinaryDotNet.Actions;

namespace AiShortsGenerator.Services;

public class CloudinaryService(IConfiguration configuration)
{
    private readonly Cloudinary _cloudinary = new(configuration["CloudinaryUrl"]);

    public async Task<string> UploadAudio(byte[] audioContent)
    {
        var uploadParams = new AutoUploadParams
        {
            File = new FileDescription(Guid.NewGuid().ToString(), new MemoryStream(audioContent)),
            Folder = "audio-files"
        };

        var uploadResult = await _cloudinary.UploadAsync(uploadParams);

        if (uploadResult.StatusCode == System.Net.HttpStatusCode.OK)
        {
            return uploadResult.SecureUrl.ToString();
        }

        throw new Exception("Audio upload failed: " + uploadResult.Error?.Message);
    }

    public async Task<string> UploadImage(byte[] imageContent)
    {
        var uploadParams = new ImageUploadParams
        {
            File = new FileDescription(Guid.NewGuid().ToString(), new MemoryStream(imageContent)),
            Folder = "image-files"
        };

        var uploadResult = await _cloudinary.UploadAsync(uploadParams);

        if (uploadResult.StatusCode == System.Net.HttpStatusCode.OK)
        {
            return uploadResult.SecureUrl.ToString();
        }

        throw new Exception("Image upload failed: " + uploadResult.Error?.Message);
    }
}

Add to Program.cs:

builder.Services.AddSingleton<CloudinaryService>();

4.7 Create the /generate-audio endpoint:

app.MapPost("/generate-audio", async (TextToSpeechService textToSpeechService, CloudinaryService cloudinary, [FromBody] JsonElement body) =>
    {
        if (!body.TryGetProperty("input", out var inputJson) || string.IsNullOrWhiteSpace(inputJson.GetString()))
        {
            return Results.BadRequest(new { success = false, message = "Required parameter 'input' is missing or invalid." });
        }

        var input = inputJson.GetString()!;
        try
        {
            var mp3Data = await textToSpeechService.SynthesizeTextToSpeech(input);
            var audioUrl = await cloudinary.UploadAudio(mp3Data);
            return Results.Ok(audioUrl);
        }
        catch (Exception ex)
        {
            return Results.Problem(detail: ex.Message);
        }
    })
    .WithName("GenerateAudio")
    .Produces<string>()
    .Produces(400);

5. Generating captions with AssemblyAI

The /generate-captions endpoint processes the MP3 file and creates the captions.

5.1 Create an account on the AssemblyAI website and get your API key.

5.2 Update the appsettings.Development.json file:

{
  "GoogleApi": {
    "GeminiKey": "your-gemini-api-key",
    "TextToSpeechKey": "your-text-to-speech-api-key"
  },
  "AssemblyAi": {
    "ApiKey": "your-assemblyai-api-key"
  }
}

5.3 Add the AssemblyAI NuGet package to the project:

dotnet add package AssemblyAI

5.4 Create a service class for AssemblyAI:

using AssemblyAI;
using AssemblyAI.Transcripts;

namespace AiShortsGenerator.Services;

public class AssemblyAiService(IConfiguration configuration)
{
    public async Task<IEnumerable<TranscriptWord>> Transcribe(string fileUrl)
    {
        var apiKey = configuration["AssemblyAi:ApiKey"];

        if (string.IsNullOrEmpty(apiKey))
        {
            throw new InvalidOperationException("API key is not configured");
        }

        var client = new AssemblyAIClient(apiKey);

        var transcriptParams = new TranscriptParams
        {
            AudioUrl = fileUrl
        };

        var transcript = await client.Transcripts.TranscribeAsync(transcriptParams);

        transcript.EnsureStatusCompleted();

        if (transcript.Words == null)
        {
            throw new InvalidOperationException("Transcript contains no words.");
        }

        return transcript.Words;
    }
}

Add to Program.cs:

builder.Services.AddScoped<AssemblyAiService>();

5.5 Create the /generate-captions endpoint:

app.MapPost("/generate-captions", async (AssemblyAiService assemblyAiService, [FromBody] JsonElement body) =>
    {
        if (!body.TryGetProperty("fileUrl", out var inputJson) || string.IsNullOrWhiteSpace(inputJson.GetString()))
        {
            return Results.BadRequest(new { success = false, message = "Required parameter 'fileUrl' is missing or invalid." });
        }

        var fileUrl = inputJson.GetString()!;
        try
        {
            var transcript = await assemblyAiService.Transcribe(fileUrl);

            return Results.Ok(transcript);
        }
        catch (Exception ex)
        {
            return Results.BadRequest(new { success = false, message = ex.Message });
        }
    })
    .WithName("GenerateCaptions")
    .Produces<string>()
    .Produces(400);
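
AssemblyAI reports word timestamps in milliseconds, while a video renderer such as Remotion works in frames, so the captions have to be mapped between the two when the video is compiled. The following is a minimal sketch of that mapping, assuming a simplified `CaptionWord` shape (`text`, `start`, `end`) taken from the endpoint's response. The function names are hypothetical and are not part of the original project.

```typescript
// Hypothetical subset of the word objects returned by /generate-captions.
interface CaptionWord {
  text: string;
  start: number; // milliseconds from the start of the audio
  end: number; // milliseconds from the start of the audio
}

// Convert a video frame index to milliseconds at the given frame rate.
function frameToMs(frame: number, fps: number): number {
  return (frame * 1000) / fps;
}

// Return the word being spoken at the given frame, or undefined during silence.
function wordAtFrame(words: CaptionWord[], frame: number, fps: number): CaptionWord | undefined {
  const timeMs = frameToMs(frame, fps);
  return words.find((w) => timeMs >= w.start && timeMs <= w.end);
}

// Number of frames needed to cover the whole narration.
function durationInFrames(words: CaptionWord[], fps: number): number {
  const lastEnd = words.reduce((max, w) => Math.max(max, w.end), 0);
  return Math.ceil((lastEnd * fps) / 1000);
}
```

For example, at 30 fps frame 15 corresponds to 500 ms, so a word spanning 450 ms to 900 ms would be the one shown on that frame.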

6. Generating images with Cloudflare Workers AI

The /generate-image endpoint converts text prompts into images.

6.1 Get the Cloudflare API key and account ID required to use Workers AI.

6.2 Update the appsettings.Development.json file:

{
  "GoogleApi": {
    "GeminiKey": "your-gemini-api-key",
    "TextToSpeechKey": "your-text-to-speech-api-key"
  },
  "AssemblyAi": {
    "ApiKey": "your-assemblyai-api-key"
  },
  "CloudinaryUrl": "your-cloudinary-url",
  "Cloudflare": {
    "ApiKey": "your-cloudflare-api-key",
    "AccountId": "your-cloudflare-account-id"
  }
}

6.3 Create classes for the Cloudflare API response and for the endpoint request:

namespace AiShortsGenerator.DTOs
{
    public class CloudflareApiResponse
    {
        public CloudflareResult Result { get; set; }
        public bool Success { get; set; }
        public List<string> Errors { get; set; }
        public List<string> Messages { get; set; }
    }

    public class CloudflareResult
    {
        public string Image { get; set; }
    }
}

namespace AiShortsGenerator.DTOs;

public class GenerateImageRequest
{
    public string Prompt { get; set; }
}

6.4 Create the service class for the Cloudflare Workers AI API:

using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using AiShortsGenerator.DTOs;

namespace AiShortsGenerator.Services;

public class CloudflareApiService
{
    private readonly HttpClient _httpClient;
    private readonly string _apiUrl;

    public CloudflareApiService(HttpClient httpClient, IConfiguration configuration)
    {
        _httpClient = httpClient;
        _apiUrl = $"https://api.cloudflare.com/client/v4/accounts/{configuration["Cloudflare:AccountId"]}/ai/run/@cf/black-forest-labs/flux-1-schnell";

        var apiKey = configuration["Cloudflare:ApiKey"];
        if (string.IsNullOrEmpty(apiKey))
        {
            throw new ArgumentException("Cloudflare API key is missing in configuration.");
        }

        _httpClient.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", apiKey);
    }

    public async Task<byte[]> GenerateImageAsync(string prompt)
    {
        var payload = new
        {
            prompt
        };

        var jsonPayload = JsonSerializer.Serialize(payload);
        var content = new StringContent(jsonPayload, Encoding.UTF8, "application/json");

        var response = await _httpClient.PostAsync(_apiUrl, content);

        if (!response.IsSuccessStatusCode)
        {
            var errorMessage = await response.Content.ReadAsStringAsync();
            throw new HttpRequestException($"Cloudflare API request failed with status code {response.StatusCode}: {errorMessage}");
        }

        var responseContent = await response.Content.ReadFromJsonAsync<CloudflareApiResponse>();

        if (responseContent is not { Success: true })
        {
            var errors = string.Join("; ", responseContent?.Errors ?? ["Unknown error"]);
            throw new HttpRequestException($"Failed to generate image: {errors}");
        }

        if (responseContent.Result?.Image is null)
        {
            throw new HttpRequestException("Failed to generate image: No image returned by Cloudflare API.");
        }

        // Log any informational messages returned alongside the image.
        if (responseContent.Messages is { Count: > 0 })
        {
            Console.WriteLine("Cloudflare API Messages:");
            foreach (var message in responseContent.Messages)
            {
                Console.WriteLine($"- {message}");
            }
        }

        // The image is returned as a Base64 string; decode it to raw bytes.
        return Convert.FromBase64String(responseContent.Result.Image);
    }
}

Add to Program.cs:

builder.Services.AddHttpClient<CloudflareApiService>();

6.5 Create the /generate-image endpoint:

app.MapPost("/generate-image", async (CloudflareApiService cloudflareApiService, CloudinaryService cloudinary, [FromBody] GenerateImageRequest request) =>
{
    if (string.IsNullOrWhiteSpace(request.Prompt))
    {
        return Results.BadRequest("Prompt is required.");
    }

    try
    {
        var imageBytes = await cloudflareApiService.GenerateImageAsync(request.Prompt);
        var imageUrl = await cloudinary.UploadImage(imageBytes);
        return Results.Ok(imageUrl);
    }
    catch (HttpRequestException ex)
    {
        return Results.Problem(ex.Message, statusCode: 500);
    }
})
.WithName("GenerateImage")
.Produces<string>()
.Produces(400)
.Produces(500);

7. Managing videos in the database

Once the AI-generated content (images, audio, captions) has been processed, the videos need to be stored in, updated in, and retrieved from the database.

7.1 Create the video model:

using System.Text.Json.Serialization;

namespace AiShortsGenerator.Models;

public class Video
{
    public int Id { get; set; }

    [JsonInclude]
    public List<VideoContentItem> VideoContent { get; set; } = [];

    public string AudioFileUrl { get; set; }

    [JsonInclude]
    public List<TranscriptSegment> Captions { get; set; } = [];

    public List<string> Images { get; set; }

    public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
    public string? OutputFile { get; set; }
    public string? RenderId { get; set; }
}

namespace AiShortsGenerator.Models;

public class VideoContentItem(string imagePrompt, string contextText)
{
    public string ImagePrompt { get; set; } = imagePrompt;
    public string ContextText { get; set; } = contextText;
}

namespace AiShortsGenerator.Models;

public class TranscriptSegment
{
    public double Confidence { get; set; }
    public double Start { get; set; }
    public double End { get; set; }
    public string Text { get; set; }
    public string Channel { get; set; }
    public string Speaker { get; set; }
}

namespace AiShortsGenerator.Models;

public class Mp3File(string fileName, byte[] fileData)
{
    public string FileName { get; init; } = fileName;
    public byte[] FileData { get; init; } = fileData;
    public DateTime CreatedAt { get; init; }
}

7.2 Create a database with the provider of your choice and update the appsettings.Development.json file:

{
  "GoogleApi": {
    "GeminiKey": "your-gemini-api-key",
    "TextToSpeechKey": "your-text-to-speech-api-key"
  },
  "AssemblyAi": {
    "ApiKey": "your-assemblyai-api-key"
  },
  "CloudinaryUrl": "your-cloudinary-url",
  "Cloudflare": {
    "ApiKey": "your-cloudflare-api-key",
    "AccountId": "your-cloudflare-account-id"
  },
  "ConnectionStrings": {
    "DefaultConnection": "your-postgresql-connection-string"
  }
}

7.3 Add the PostgreSQL and EF Core packages:

dotnet add package Npgsql
dotnet add package Npgsql.EntityFrameworkCore.PostgreSQL
dotnet add package Microsoft.EntityFrameworkCore.Design

7.4 Create and apply the migrations (these commands need the dotnet-ef tool installed and the DbContext from step 7.5 in place):

dotnet ef migrations add InitialCreate
dotnet ef database update

7.5 Configure the DbContext:

using AiShortsGenerator.Models;
using Microsoft.EntityFrameworkCore;

namespace AiShortsGenerator.Data;

public class AppDbContext(DbContextOptions<AppDbContext> options) : DbContext(options)
{
    public DbSet<Video> Videos { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        base.OnModelCreating(modelBuilder);

        // Store the nested lists as jsonb columns instead of separate tables.
        modelBuilder.Entity<Video>(entity =>
        {
            entity.Property(v => v.VideoContent)
                .HasColumnType("jsonb");

            entity.Property(v => v.Captions)
                .HasColumnType("jsonb");
        });
    }
}

Add to Program.cs:

builder.Services.AddDbContext<AppDbContext>(options =>
{
    var dataSourceBuilder = new NpgsqlDataSourceBuilder(builder.Configuration.GetConnectionString("DefaultConnection"));
    dataSourceBuilder.EnableDynamicJson();
    options.UseNpgsql(dataSourceBuilder.Build());
});

7.6 Create the endpoint to save a video:

app.MapPost("/save-video", async ([FromBody] Video video, AppDbContext context) =>
{
    await context.Videos.AddAsync(video);
    await context.SaveChangesAsync();

    return Results.Ok(new { message = "Video saved successfully", videoId = video.Id });
})
.WithName("SaveVideo")
.Produces(200);

7.7 Create the endpoint to update a video:

app.MapPut("/videos/{id:int}", async (int id, AppDbContext context, [FromBody] UpdateVideoRequest request) =>
{
    var video = await context.Videos.FirstOrDefaultAsync(v => v.Id == id);

    if (video == null)
    {
        return Results.NotFound();
    }

    // Nothing to update when no output file is provided; return the video unchanged.
    if (string.IsNullOrEmpty(request.OutputFile))
    {
        return Results.Ok(video);
    }

    video.OutputFile = request.OutputFile;
    video.RenderId = request.RenderId;
    await context.SaveChangesAsync();

    return Results.Ok(video);
})
.WithName("UpdateVideo")
.Produces<Video>()
.Produces(404);

7.8 Create the request DTO for updating a video:

namespace AiShortsGenerator.DTOs;

public class UpdateVideoRequest
{
    public string OutputFile { get; set; }
    public string RenderId { get; set; }
}

7.9 Create the endpoint to list videos:

app.MapGet("/videos", async (AppDbContext context) =>
{
    var videos = await context.Videos.ToListAsync();

    return Results.Ok(videos);
})
.WithName("GetVideos")
.Produces<List<Video>>();

7.10 Create the endpoint to delete a video:

app.MapDelete("/videos/{id:int}", async (int id, AppDbContext context) =>
{
    var video = await context.Videos.FirstOrDefaultAsync(v => v.Id == id);

    if (video == null)
    {
        return Results.NotFound();
    }

    context.Videos.Remove(video);
    await context.SaveChangesAsync();

    return Results.NoContent();
})
.WithName("DeleteVideo")
.Produces(204)
.Produces(404);

8. Finishing the backend

8.1. CORS (Cross-Origin Resource Sharing) configuration

Because the frontend (Next.js) and the backend (ASP.NET Core) run on different ports during development, we need to configure CORS so they can talk to each other:

builder.Services.AddCors(options =>
{
    options.AddPolicy("AllowSpecificOrigins", policy =>
    {
        policy.WithOrigins("http://localhost:3000") // Allow frontend
            .AllowAnyHeader()
            .AllowAnyMethod();
    });

    options.AddPolicy("AllowAll", policy =>
    {
        policy.AllowAnyOrigin()
            .AllowAnyHeader()
            .AllowAnyMethod();
    });
});

var app = builder.Build();

// Apply CORS based on the environment
app.UseCors(app.Environment.IsDevelopment() ? "AllowAll" : "AllowSpecificOrigins");

Explanation:

During development, all origins are allowed, which makes debugging easier.
In production, access is restricted to the frontend URL only.

8.2. Database migration setup

To keep the database schema up to date, we run the Entity Framework Core migrations when the application starts.

Apply migrations automatically:

using (var scope = app.Services.CreateScope())
{
    var dbContext = scope.ServiceProvider.GetRequiredService<AppDbContext>();
    dbContext.Database.Migrate(); // Applies pending migrations automatically
}

Manual migration commands

If you prefer to run migrations manually, use:

# Create a new migration
dotnet ef migrations add InitialCreate

# Apply migrations
dotnet ef database update

Why automate migrations?

It lets you deploy without manual intervention.
It avoids errors caused by an outdated database schema.

9. Running the backend

With everything configured, start the backend with:

dotnet run

Step-by-step frontend implementation

The frontend is built with Next.js and provides the user interface for managing the videos.

1. Project setup

1.1 Install the dependencies

npm install axios lucide-react

1.2 Add the button, card, dialog, dropdown-menu, input, label, progress, select, separator, sheet, sidebar, skeleton, sonner, textarea, and tooltip components from shadcn/ui

1.3 Install Remotion following this guide: Installing Remotion in an existing project

1.4 Configure the environment variables (in .env.local)

NEXT_PUBLIC_API_URL=http://localhost:5211
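
Every frontend request is built from this base URL plus an endpoint path, and a stray or missing slash is an easy mistake to make. As a minimal sketch, a small helper can normalize the join. The name `apiUrl` is hypothetical and not part of the original project; in the app, the base would come from `process.env.NEXT_PUBLIC_API_URL`.

```typescript
// Join a base URL and an endpoint path without doubling or dropping slashes.
function apiUrl(baseUrl: string, path: string): string {
  const base = baseUrl.replace(/\/+$/, ""); // strip trailing slashes
  const suffix = path.replace(/^\/+/, ""); // strip leading slashes
  return `${base}/${suffix}`;
}
```

With the value above, `apiUrl("http://localhost:5211/", "/videos")` yields `http://localhost:5211/videos`, whether or not either side carries a slash.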

2. Implement the landing page

2.1. Home page (app/(site)/page.tsx)

Landing page with a CTA (call to action) button.

import Link from 'next/link'
import { Button } from '@/components/ui/button'

export default function Home() {
  return (
    <div className='flex min-h-screen flex-col items-center justify-center px-4 text-center'>
      <div className='w-full max-w-3xl'>
        <h1 className='text-4xl font-bold sm:text-5xl md:text-6xl'>Create AI Shorts Instantly</h1>
        <p className='mt-4 text-lg text-muted-foreground sm:text-xl'>
          AI-generated short videos with subtitles and voiceovers, no editing needed!
        </p>
        <Button asChild className='mt-8 w-full sm:w-auto'>
          <Link href='/dashboard/create-new'>Start Generating</Link>
        </Button>
      </div>
    </div>
  )
}

2.2. Navbar component (app/components/NavBar.tsx)

Create a reusable navigation bar.

import Link from 'next/link'

export default function Navbar() {
  return (
    <nav className='p-4 shadow-md'>
      <div className='container mx-auto flex justify-between'>
        <Link href='/' className='text-xl font-bold'>AI Shorts</Link>
        <Link href='/dashboard' className='text-gray-600 hover:underline'>Dashboard</Link>
      </div>
    </nav>
  )
}

2.3. Site layout (app/(site)/layout.tsx)

Wraps the landing page with the navigation bar.

import Navbar from '../components/NavBar'

export default function SiteLayout({ children }: { children: React.ReactNode }) {
  return (
    <>
      <Navbar />
      <main className='grow'>{children}</main>
    </>
  )
}

3. Implement the dashboard

I implemented the dashboard following this tutorial: The easiest way to build a sidebar menu in NextJs 15

3.1. Dashboard layout (app/dashboard/layout.tsx)

import DashboardSidebar from '../components/DashboardSidebar'

export default function DashboardLayout({ children }: { children: React.ReactNode }) {
  return (
    <div className='flex'>
      <DashboardSidebar />
      <main className='flex-grow p-4'>{children}</main>
    </div>
  )
}

3.2 DashboardSidebar component (app/dashboard/_components/DashboardSidebar.tsx)

import Link from 'next/link'
import { ArrowLeft, FileVideo, LayoutDashboard } from 'lucide-react'

import {
  Sidebar,
  SidebarContent,
  SidebarGroup,
  SidebarGroupContent,
  SidebarGroupLabel,
  SidebarHeader,
  SidebarMenu,
  SidebarMenuButton,
  SidebarMenuItem,
  SidebarRail,
} from '@/components/ui/sidebar'

// Menu items.
const items = [
  {
    title: 'Dashboard',
    url: '/dashboard',
    icon: LayoutDashboard,
  },
  {
    title: 'Create new',
    url: '/dashboard/create-new',
    icon: FileVideo,
  },
]

export default function DashboardSidebar() {
  return (
    <Sidebar collapsible='icon' variant='inset'>
      <SidebarHeader>
        <SidebarMenu>
          <SidebarMenuItem>
            <SidebarMenuButton asChild>
              <Link href='/' className='text-sky-700 hover:text-sky-600'>
                <ArrowLeft />
                <span>Back to site</span>
              </Link>
            </SidebarMenuButton>
          </SidebarMenuItem>
        </SidebarMenu>
      </SidebarHeader>

      <SidebarContent>
        <SidebarGroup>
          <SidebarGroupLabel>Dashboard</SidebarGroupLabel>

          <SidebarGroupContent>
            <SidebarMenu>
              {items.map((item) => (
                <SidebarMenuItem key={item.title}>
                  <SidebarMenuButton asChild>
                    <Link href={item.url}>
                      <item.icon />
                      <span>{item.title}</span>
                    </Link>
                  </SidebarMenuButton>
                </SidebarMenuItem>
              ))}
            </SidebarMenu>
          </SidebarGroupContent>
        </SidebarGroup>
      </SidebarContent>

      <SidebarRail />
    </Sidebar>
  )
}

3.3. Dashboard Page (app/dashboard/page.tsx)

Displays the list of videos and a "Create New Video" button.

import Link from 'next/link'
import { PlusCircle } from 'lucide-react'
import { Button } from '@/components/ui/button'
import ShortVideoGrid from './_components/ShortVideoGrid'

export default function DashboardPage() {
  return (
    <div className='p-6'>
      <div className='flex justify-between'>
        <h1 className='text-2xl font-bold'>Dashboard</h1>
        {/* asChild renders the link itself as the button, avoiding a <button> nested inside an <a> */}
        <Button asChild>
          <Link href='/dashboard/create-new'>
            <PlusCircle className='mr-2' /> Create New Video
          </Link>
        </Button>
      </div>
      <ShortVideoGrid />
    </div>
  )
}

3.4 Short Video Grid Component (app/dashboard/_components/ShortVideoGrid.tsx)

'use client'

import { useEffect, useState } from 'react'
import { PlusCircle } from 'lucide-react'
import axios from 'axios'
import Link from 'next/link'
import { Thumbnail } from '@remotion/player'

import { MyComposition } from '@/remotion/Composition'
import type { VideoData } from '@/app/lib/interface'
import { Button } from '@/components/ui/button'

import { SkeletonCard } from './SkeletonCard'

type ShortVideoGridData = {
  createdAt: string
  id: number // Assuming each video has a unique id
} & VideoData

export default function ShortVideoGrid() {
  const [loading, setLoading] = useState(false)
  const [videos, setVideos] = useState<ShortVideoGridData[]>([])

  const GetVideos = async () => {
    setLoading(true)
    try {
      const resp = await axios.get<ShortVideoGridData[]>(
        `${process.env.NEXT_PUBLIC_API_URL}/videos`,
      )
      if (resp.data) {
        // Copy before sorting so the response array is not mutated; newest first
        const sortedVideos = [...resp.data].sort(
          (a, b) =>
            new Date(b.createdAt).getTime() - new Date(a.createdAt).getTime(),
        )
        setVideos(sortedVideos)
      }
    } catch (error) {
      console.error('Error fetching videos:', error)
    } finally {
      // Always clear the loading state, even when the request fails
      setLoading(false)
    }
  }

  useEffect(() => {
    GetVideos()
  }, [])

  if (loading) {
    return <SkeletonCard count={10} />
  }

  if (videos.length === 0) {
    return (
      <div className='flex h-64 flex-col items-center justify-center'>
        <p className='mb-4 text-xl'>No short videos yet</p>
        <Button asChild>
          <Link href='/dashboard/create-new'>
            <PlusCircle className='mr-2 size-4' /> Create New Short Video
          </Link>
        </Button>
      </div>
    )
  }

  return (
    <>
      <div className='grid grid-cols-1 gap-4 sm:grid-cols-2 md:grid-cols-3 lg:grid-cols-4 xl:grid-cols-5'>
        {videos.map((video) => (
          <button
            key={video.id} // Use a unique identifier as the key
            className='relative aspect-square h-[450px] w-[300px] overflow-hidden rounded-lg transition-all duration-300 ease-in-out hover:scale-105 focus:outline-none focus:ring-2 focus:ring-primary'
          >
            <Thumbnail
              component={MyComposition}
              compositionWidth={300}
              compositionHeight={450}
              frameToDisplay={30}
              durationInFrames={120}
              fps={30}
              inputProps={video}
            />
          </button>
        ))}
      </div>
    </>
  )
}
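
The newest-first ordering used by the grid can also live in a small pure function, which is easier to test than logic buried in a component. This is only a sketch: `sortNewestFirst` and `HasCreatedAt` are hypothetical names that are not part of the project, and it assumes `createdAt` is a date string that `new Date()` can parse (for example ISO 8601).

```typescript
// Hypothetical helper, not part of the original project.
type HasCreatedAt = { createdAt: string }

// Returns a new array sorted by createdAt, newest first.
// The input array is copied, so the caller's data is left untouched.
function sortNewestFirst<T extends HasCreatedAt>(items: readonly T[]): T[] {
  return [...items].sort(
    (a, b) => new Date(b.createdAt).getTime() - new Date(a.createdAt).getTime(),
  )
}

const sample = [
  { id: 1, createdAt: '2025-01-01T10:00:00Z' },
  { id: 2, createdAt: '2025-03-01T10:00:00Z' },
  { id: 3, createdAt: '2025-02-01T10:00:00Z' },
]

console.log(sortNewestFirst(sample).map((v) => v.id)) // [ 2, 3, 1 ]
```

Copying before sorting matters because `Array.prototype.sort` sorts in place.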

3.5 Skeleton Card Component (app/dashboard/_components/SkeletonCard.tsx)

import { Skeleton } from '@/components/ui/skeleton'

export const SkeletonCard = ({ count = 5 }) => {
  return (
    <div className='grid grid-cols-1 gap-4 sm:grid-cols-2 md:grid-cols-3 lg:grid-cols-4 xl:grid-cols-5'>
      {Array.from({ length: count }).map((_, index) => (
        // eslint-disable-next-line react/no-array-index-key
        <Skeleton key={index} className='h-[450px] w-[300px] rounded-xl' />
      ))}
    </div>
  )
}

4. Implement video generation

Handles the user input (topic, style, duration).
Fetches the AI-generated content (script, audio, captions, images).
Plays the generated video.

4.1 Create the types if you are using TypeScript:

export type VideoContentItem = {
  imagePrompt: string
  contextText: string
}

export type TranscriptSegment = {
  confidence: number
  start: number
  end: number
  text: string
  channel: string | null
  speaker: string | null
}

export type VideoData = {
  id?: number
  videoContent: VideoContentItem[]
  audioFileUrl: string
  captions: TranscriptSegment[]
  images: string[]
  outputFile?: string
  renderId?: string
}
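
Because the video content originates from a language model, it can be worth checking its shape at runtime before trusting the static type. The sketch below is one way to do that; `isVideoContentItem` and `parseVideoContent` are hypothetical helpers that are not part of the project, and the sketch assumes the API returns camelCase keys matching `VideoContentItem`.

```typescript
type VideoContentItem = {
  imagePrompt: string
  contextText: string
}

// Hypothetical type guard: true only when both fields are strings.
function isVideoContentItem(value: unknown): value is VideoContentItem {
  if (typeof value !== 'object' || value === null) return false
  const item = value as Record<string, unknown>
  return (
    typeof item.imagePrompt === 'string' && typeof item.contextText === 'string'
  )
}

// Hypothetical parser: throws if the JSON is not an array of valid items.
function parseVideoContent(json: string): VideoContentItem[] {
  const parsed: unknown = JSON.parse(json)
  if (!Array.isArray(parsed) || !parsed.every(isVideoContentItem)) {
    throw new Error('Unexpected video content shape')
  }
  return parsed
}

const scenes = parseVideoContent(
  '[{"imagePrompt":"a castle at dawn","contextText":"Once upon a time"}]',
)
console.log(scenes.length) // 1
```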

4.2 Select Topic component (app/dashboard/create-new/_components/SelectTopic.tsx)

'use client'

import { useState } from 'react'

import { Label } from '@/components/ui/label'
import {
  Select,
  SelectContent,
  SelectItem,
  SelectTrigger,
  SelectValue,
} from '@/components/ui/select'
import { Textarea } from '@/components/ui/textarea'

const options = [
  'Custom Prompt',
  'Random AI Story',
  'Historical Facts',
  'Fun Facts',
  'Science Facts',
  'Motivational',
  'Scary Story',
  'Adventure Story',
  'Fantasy Story',
  'Sci-Fi Story',
  'Steampunk Story',
  'Romance Story',
  'Mystery/Thriller Story',
  'Historical Fiction',
  'Poems',
  'Tech Trends',
  'Philosophical Quotes',
  'Space Exploration',
  'Mythology',
]

type SelectTopicProps = {
  // eslint-disable-next-line no-unused-vars
  onUserSelect: (fieldName: string, fieldValue: string) => void
}

export default function SelectTopic({ onUserSelect }: SelectTopicProps) {
  const [contentType, setContentType] = useState('')
  return (
    <div className='space-y-2'>
      <Label htmlFor='content-type' className='text-lg font-semibold'>
        Content Type
      </Label>
      <Select
        onValueChange={(value) => {
          setContentType(value)
          // For 'Custom Prompt' the topic comes from the textarea below
          if (value !== 'Custom Prompt') {
            onUserSelect('topic', value)
          }
        }}
        value={contentType}
      >
        <SelectTrigger id='content-type' name='topic'>
          <SelectValue placeholder='Select content type' />
        </SelectTrigger>
        <SelectContent>
          {options.map((item) => (
            <SelectItem key={item} value={item}>
              {item}
            </SelectItem>
          ))}
        </SelectContent>
      </Select>
      {contentType === 'Custom Prompt' && (
        <Textarea
          onChange={(e) => onUserSelect('topic', e.target.value)}
          placeholder='Write your custom prompt'
        />
      )}
    </div>
  )
}

4.3 Select Style component (app/dashboard/create-new/_components/SelectStyle.tsx)

'use client'

import { useState } from 'react'
import Image from 'next/image'

import { Card, CardTitle } from '@/components/ui/card'

const options = [
  {
    name: 'Realistic',
    image: '/images/realistic.png',
  },
  {
    name: 'Cartoon',
    image: '/images/cartoon.png',
  },
  {
    name: 'Comic',
    image: '/images/comic.png',
  },
  {
    name: 'WaterColor',
    image: '/images/watercolor.png',
  },
  {
    name: 'Drawing',
    image: '/images/drawing.png',
  },
  {
    name: 'Monochrome',
    image: '/images/monochrome.png',
  },
  {
    name: 'Oil Painting',
    image: '/images/oil-painting.png',
  },
  {
    name: 'Pixel Art',
    image: '/images/pixel-art.png',
  },
  {
    name: 'retro',
    image: '/images/retro.png',
  },
  {
    name: 'Surreal',
    image: '/images/surreal.png',
  },
]

type SelectStyleProps = {
  // eslint-disable-next-line no-unused-vars
  onUserSelect: (fieldName: string, fieldValue: string) => void
}

export default function SelectStyle({ onUserSelect }: SelectStyleProps) {
  const [selectOption, setSelectOption] = useState('')
  return (
    // A <legend> is only valid as the first child of a <fieldset>
    <fieldset className='space-y-2'>
      <legend className='mb-4 text-lg font-semibold'>Image Style</legend>
      <div
        id='style'
        className='grid grid-cols-2 gap-5 md:grid-cols-3 lg:grid-cols-5 xl:grid-cols-6'
      >
        {options.map((item) => (
          <Card
            onClick={() => {
              setSelectOption(item.name)
              onUserSelect('imageStyle', item.name)
            }}
            key={item.name}
            className={`cursor-pointer transition-all ${
              selectOption === item.name
                ? 'border-4 border-black dark:border-white'
                : 'hover:scale-105'
            }`}
          >
            <div className='relative aspect-square w-full'>
              <div className='absolute inset-0'>
                <Image
                  alt={`${item.name} style preview`}
                  className={`h-auto w-full rounded-lg object-cover ${
                    selectOption === item.name ? 'opacity-50' : 'opacity-100'
                  }`}
                  height='1024'
                  src={item.image}
                  width='1024'
                  priority
                />
              </div>
              <div className='absolute inset-x-0 bottom-0 rounded-b-lg bg-black/75 p-2'>
                <CardTitle className='text-2xl font-bold text-white'>
                  {item.name}
                </CardTitle>
              </div>
            </div>
          </Card>
        ))}
      </div>
    </fieldset>
  )
}

The style preview images were created with Cloudflare Workers AI.

4.4 Select Duration component (app/dashboard/create-new/_components/SelectDuration.tsx)

import { Label } from '@/components/ui/label'
import {
  Select,
  SelectContent,
  SelectItem,
  SelectTrigger,
  SelectValue,
} from '@/components/ui/select'

type SelectDurationProps = {
  // eslint-disable-next-line no-unused-vars
  onUserSelect: (fieldName: string, fieldValue: string) => void
}

export default function SelectDuration({ onUserSelect }: SelectDurationProps) {
  return (
    <div className='space-y-2'>
      <Label htmlFor='video-duration' className='text-lg font-semibold'>
        Video Duration
      </Label>
      <Select onValueChange={(value) => onUserSelect('duration', value)}>
        <SelectTrigger id='video-duration' name='duration'>
          <SelectValue placeholder='Select duration' />
        </SelectTrigger>
        <SelectContent>
          <SelectItem value='15 seconds'>15 seconds</SelectItem>
          <SelectItem value='30 seconds'>30 seconds</SelectItem>
          <SelectItem value='60 seconds'>60 seconds</SelectItem>
        </SelectContent>
      </Select>
    </div>
  )
}
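
The select reports its value as a string that already contains the unit (for example '30 seconds'). If a numeric value is ever needed, for instance to validate the choice or to format the prompt differently, a small parser can extract it. `parseDurationSeconds` below is a hypothetical helper, not something the project defines.

```typescript
// Hypothetical helper: turns a label such as '30 seconds' into the number 30.
function parseDurationSeconds(label: string): number {
  // Accepts an integer followed by 'second' or 'seconds', in any letter case
  const match = /^(\d+)\s*seconds?$/i.exec(label.trim())
  if (!match) {
    throw new Error(`Unrecognized duration: ${label}`)
  }
  return Number(match[1])
}

console.log(parseDurationSeconds('30 seconds')) // 30
```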

4.5 Loading Component (app/components/Loading.tsx)

import { Progress } from '@/components/ui/progress'

type LoadingProps = {
  loading: boolean
  progress?: number
  message: string
  showProgress?: boolean
}

export default function Loading({
  loading,
  progress,
  message,
  showProgress = true,
}: LoadingProps) {
  // Render nothing when idle, so the overlay needs no show/hide class toggle
  if (!loading) return null

  return (
    <div className='fixed inset-0 z-50 flex items-center justify-center bg-black/50'>
      <div className='flex flex-col items-center space-y-4'>
        <div className='size-12 animate-spin rounded-full border-4 border-gray-400 border-t-transparent' />
        <p className='text-white dark:text-gray-300'>{message}</p>

        {showProgress && progress !== undefined && (
          <Progress
            value={progress}
            max={100}
            className='mt-4 h-2 w-64 rounded-full bg-gray-300 dark:bg-gray-700'
          />
        )}
      </div>
    </div>
  )
}

4.6 Create New page (app/dashboard/create-new/page.tsx)

'use client'

import { useEffect, useState } from 'react'
import axios from 'axios'

import { Button } from '@/components/ui/button'
import {
  Card,
  CardContent,
  CardFooter,
  CardHeader,
  CardTitle,
} from '@/components/ui/card'
import Loading from '@/app/components/Loading'
import type { VideoContentItem, VideoData } from '@/app/lib/interface'

import SelectTopic from './_components/SelectTopic'
import SelectStyle from './_components/SelectStyle'
import SelectDuration from './_components/SelectDuration'

type FormData = {
  topic: string
  imageStyle: string
  duration: string
}

export default function CreateNew() {
  const [isLoading, setIsLoading] = useState(false)
  const [loadingMessage, setLoadingMessage] = useState(
    'Generating your video...',
  )
  const [progress, setProgress] = useState(0)
  const [formData, setFormData] = useState<FormData>({} as FormData)
  const [videoData, setVideoData] = useState<VideoData>({
    videoContent: [],
    audioFileUrl: '',
    captions: [],
    images: [],
  })

  // Save the video once every part (script, audio, captions, images) is available
  useEffect(() => {
    if (
      videoData.videoContent.length > 0 &&
      videoData.audioFileUrl &&
      videoData.captions.length > 0 &&
      videoData.images.length > 0
    ) {
      SaveVideoToDatabase(videoData)
    }
  }, [videoData])

  const onHandleInputChange = (fieldName: string, fieldValue: string) => {
    setFormData((prev) => ({
      ...prev,
      [fieldName]: fieldValue,
    }))
  }

  const onCreateSubmitHandler = (e: React.FormEvent) => {
    e.preventDefault()
    getVideoContent()
  }

  // Step 1: generate the script, then chain into audio, captions and images
  const getVideoContent = async () => {
    setIsLoading(true)
    setLoadingMessage('Generating video content...')
    setProgress(20)

    // formData.duration already includes the unit (for example '30 seconds')
    const prompt = `Generate a script for a video lasting ${formData.duration} on the topic '${formData.topic}'. For each scene, provide the following in JSON format: [{'ContextText': '<Description of the scene (concise and fitting the duration)>','ImagePrompt': '<AI image generation prompt in ${formData.imageStyle} style>'}] Ensure all fields are well-structured, and do not include plain text outside the JSON.`

    try {
      const resp = await axios.post(
        `${process.env.NEXT_PUBLIC_API_URL}/generate-content`,
        {
          input: prompt,
        },
      )

      if (resp.data) {
        setVideoData((prev) => {
          return {
            ...prev,
            videoContent: resp.data,
          }
        })
        await GenerateAudioFile(resp.data)
      }
    } catch (error) {
      // A failure in any chained step ends up here
      console.error('Error generating video:', error)
    } finally {
      // Hide the loader on success, on failure and on an empty response
      setIsLoading(false)
    }
  }

  // Step 2: turn the script into speech
  const GenerateAudioFile = async (videoContentData: VideoContentItem[]) => {
    setLoadingMessage('Generating audio file...')
    setProgress(50)
    // Join the scene texts with a space so sentences do not run together
    const script = videoContentData.map((item) => item.contextText).join(' ')
    const resp = await axios.post(
      `${process.env.NEXT_PUBLIC_API_URL}/generate-audio`,
      {
        input: script,
      },
    )
    if (resp.data) {
      setVideoData((prev) => {
        return {
          ...prev,
          audioFileUrl: resp.data,
        }
      })
      await GenerateCaptions(resp.data, videoContentData)
    }
  }

  // Step 3: transcribe the audio to get timed captions
  const GenerateCaptions = async (
    fileUrl: string,
    videoContentData: VideoContentItem[],
  ) => {
    setLoadingMessage('Generating captions...')
    setProgress(75)
    const resp = await axios.post(
      `${process.env.NEXT_PUBLIC_API_URL}/generate-captions`,
      {
        fileUrl,
      },
    )
    if (resp.data) {
      setVideoData((prev) => {
        return {
          ...prev,
          captions: resp.data,
        }
      })
      await GenerateImage(videoContentData)
    }
  }

  const SaveVideoToDatabase = async (videoData: VideoData) => {
    try {
      await axios.post(`${process.env.NEXT_PUBLIC_API_URL}/save-video`, {
        videoContent: videoData.videoContent,
        captions: videoData.captions,
        images: videoData.images,
        audioFileUrl: videoData.audioFileUrl,
      })
    } catch (error) {
      console.error('Error saving video:', error)
    }
  }

  // Step 4: generate one image per scene, sequentially
  const GenerateImage = async (videoContent: VideoContentItem[]) => {
    setLoadingMessage(
      'Generating images... This part can take a minute or two.',
    )
    setProgress(90)
    const responseImages: string[] = []
    for (const item of videoContent) {
      try {
        const resp = await axios.post(
          `${process.env.NEXT_PUBLIC_API_URL}/generate-image`,
          {
            prompt: item.imagePrompt,
          },
        )
        responseImages.push(resp.data)
      } catch (e) {
        // Skip the failed image and continue with the remaining scenes
        console.error('Error generating image:', e)
      }
    }
    setVideoData((prev) => {
      return {
        ...prev,
        images: responseImages,
      }
    })
    setProgress(100)
    setIsLoading(false)
  }

  return (
    <div className='container mx-auto px-4 py-8'>
      <Card className='mx-auto max-w-full'>
        <CardHeader>
          <CardTitle className='text-center text-2xl font-bold'>
            Create New Short Video
          </CardTitle>
        </CardHeader>
        <form onSubmit={onCreateSubmitHandler}>
          <CardContent className='space-y-4'>
            <SelectTopic onUserSelect={onHandleInputChange} />
            <SelectStyle onUserSelect={onHandleInputChange} />
            <SelectDuration onUserSelect={onHandleInputChange} />
          </CardContent>
          <CardFooter>
            <Button type='submit' className='w-full'>
              Generate
            </Button>
          </CardFooter>
        </form>
      </Card>
      <Loading
        loading={isLoading}
        progress={progress}
        message={loadingMessage}
      />
    </div>
  )
}
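
The narration sent to the text-to-speech endpoint is simply the scene texts concatenated in order. As a sketch, that step can be isolated in a pure function; `buildNarrationScript` is a hypothetical name, and trimming and skipping empty scenes are assumptions of this example rather than behavior of the page above.

```typescript
type VideoContentItem = {
  imagePrompt: string
  contextText: string
}

// Hypothetical helper: joins the scene texts into one narration string.
// Empty scenes are dropped and a single space separates the rest.
function buildNarrationScript(items: VideoContentItem[]): string {
  return items
    .map((item) => item.contextText.trim())
    .filter((text) => text.length > 0)
    .join(' ')
}

const narration = buildNarrationScript([
  { imagePrompt: 'a rocket on the launch pad', contextText: 'The countdown begins.' },
  { imagePrompt: 'a rocket in orbit', contextText: ' Minutes later, it reaches orbit. ' },
])
console.log(narration) // The countdown begins. Minutes later, it reaches orbit.
```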

5. Create the dialog with the Remotion video player

Once everything has been generated successfully, the dialog plays the video.

5.1 VideoPlayerDialog component (app/components/VideoPlayerDialog.tsx)

import { useEffect, useState } from 'react'
import { Player } from '@remotion/player'
import axios from 'axios'
import { useRouter } from 'next/navigation'
import { Download, Trash, X } from 'lucide-react'

import {
  Dialog,
  DialogContent,
  DialogDescription,
  DialogFooter,
  DialogHeader,
  DialogTitle,
} from '@/components/ui/dialog'
import { Button } from '@/components/ui/button'
import { MyComposition } from '@/remotion/Composition'
import type { VideoData } from '@/app/lib/interface'

import Loading from './Loading'

type VideoPlayerDialogProps = {
  video: VideoData | null
  isOpen: boolean
  onClose: () => void
  refreshVideos?: () => void
}

export function VideoPlayerDialog({
  video,
  isOpen,
  onClose,
  refreshVideos,
}: VideoPlayerDialogProps) {
  const router = useRouter()
  const [isLoading, setIsLoading] = useState(false)
  const [outputFileUrl, setOutputFileUrl] = useState<string | null>(null)

  // Keep the output URL in sync with the currently selected video
  useEffect(() => {
    if (video?.outputFile) {
      setOutputFileUrl(video.outputFile)
    } else {
      setOutputFileUrl(null)
    }
  }, [video])

  if (!video) {
    return null
  }

  // Caption timestamps are in milliseconds; the player runs at 30 fps.
  // Math.ceil already yields a whole number of frames.
  const durationInFrame =
    video.captions.length > 0
      ? Math.ceil((video.captions[video.captions.length - 1].end / 1000) * 30)
      : 700

  const handleCancel = () => {
    onClose()
    router.push('/dashboard')
  }

  return (
    <Dialog open={isOpen} onOpenChange={onClose}>
      <DialogContent className='max-w-xl p-0'>
        <DialogHeader className='border-b p-4'>
          <DialogTitle className='text-lg font-semibold'>
            Generated Video
          </DialogTitle>
          <DialogDescription>
            Preview your generated video. You can export, delete or cancel to go
            back to the dashboard.
          </DialogDescription>
        </DialogHeader>

        <Loading
          loading={isLoading}
          showProgress={false}
          message='Rendering video, please wait...'
        />
        <div className='flex justify-center p-4'>
          <Player
            className='h-auto w-full rounded-md shadow-md'
            component={MyComposition}
            durationInFrames={durationInFrame}
            compositionWidth={300}
            compositionHeight={450}
            fps={30}
            controls
            inputProps={{
              ...video,
            }}
          />
        </div>

        <DialogFooter className='gap-4 border-t p-4 sm:justify-between'>
          <div className='flex space-x-2'>
            <Button type='button' variant='secondary' onClick={handleCancel}>
              <X className='mr-2 size-4' />
              Cancel
            </Button>
          </div>
        </DialogFooter>
      </DialogContent>
    </Dialog>
  )
}
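
The player length is derived from the end timestamp of the last caption. As a sketch, the same arithmetic can be written as a standalone function; `durationInFramesFromCaptions` and `CaptionTiming` are hypothetical names, timestamps are assumed to be in milliseconds (the component divides by 1000), and the defaults mirror the 30 fps and 700-frame fallback used above.

```typescript
type CaptionTiming = { start: number; end: number }

// Hypothetical helper: length of the video in whole frames.
function durationInFramesFromCaptions(
  captions: CaptionTiming[],
  fps = 30,
  fallbackFrames = 700,
): number {
  if (captions.length === 0) return fallbackFrames
  const lastEndMs = captions[captions.length - 1].end
  // Round up so the final word is never cut off
  return Math.ceil((lastEndMs / 1000) * fps)
}

// A narration ending at 4.5 s lasts 4.5 * 30 = 135 frames
console.log(durationInFramesFromCaptions([{ start: 0, end: 4500 }])) // 135
```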

5.2 Add the video player dialog to dashboard/create-new/page.tsx:

import { VideoPlayerDialog } from '@/app/components/VideoPlayerDialog'

export default function CreateNew() {
  const [playVideo, setPlayVideo] = useState(false)

playVideo is a boolean state that determines whether the VideoPlayerDialog is open (true) or closed (false).
It starts as false, so the dialog is not shown until the video is ready.

    // At the end of GenerateImage
    setPlayVideo(true)
  }

  return (

Once the images have been generated successfully, playVideo is set to true, which opens the video player dialog.

      {/* Rendered at the end of the page, after <Loading /> */}
      <VideoPlayerDialog
        isOpen={playVideo}
        onClose={() => setPlayVideo(false)}
        video={videoData}
      />
    </div>
  )
}

5.3 Also add the video player dialog to /dashboard/_components/ShortVideoGrid.tsx

import { VideoPlayerDialog } from '@/app/components/VideoPlayerDialog'

export default function ShortVideoGrid() {
  const [selectedVideo, setSelectedVideo] = useState<VideoData | null>(null)

  // ...

          <button
            key={video.id} // Use a unique identifier as the key
            className='relative aspect-square h-[450px] w-[300px] overflow-hidden rounded-lg transition-all duration-300 ease-in-out hover:scale-105 focus:outline-none focus:ring-2 focus:ring-primary'
            onClick={() => setSelectedVideo(video)}
          >

  // ...

      <VideoPlayerDialog
        isOpen={!!selectedVideo}
        onClose={() => setSelectedVideo(null)}
        video={selectedVideo}
        refreshVideos={GetVideos}
      />
    </>
  )
}

5.4 Style the Remotion Player in remotion/Composition.tsx, which you created by following this guide: Installing Remotion in an existing project

import {
  AbsoluteFill,
  Audio,
  Img,
  interpolate,
  Sequence,
  useCurrentFrame,
  useVideoConfig,
} from 'remotion'

import type { TranscriptSegment, VideoData } from '@/app/lib/interface'

export const MyComposition = ({
  audioFileUrl,
  captions,
  images,
}: VideoData) => {
  const { fps } = useVideoConfig()
  const frame = useCurrentFrame()

  // Without captions the duration is unknown, so there is nothing to render
  if (captions.length === 0) {
    return null
  }

  // Total length in frames, taken from the end of the last caption (ms).
  // It is derived directly from the props, so no state or effect is needed.
  const lastSegment = captions[captions.length - 1]
  const durationFrame = Math.ceil((lastSegment.end / 1000) * fps)

  const getCurrentCaption = () => {
    // Convert the current frame to milliseconds using the composition fps
    const currentTime = (frame / fps) * 1000
    const currentCaption = captions.find(
      (word: TranscriptSegment) =>
        currentTime >= word.start && currentTime <= word.end,
    )
    return currentCaption ? currentCaption.text : ''
  }

  return (
    <AbsoluteFill>
      {images.map((item, index) => {
        const key = item || `image-${index}`
        // Images start at equal intervals across the video (whole frames)
        const startTime = Math.floor((index * durationFrame) / images.length)
        // Every sequence stays mounted until the end, so each later image
        // is drawn on top of the earlier ones
        const duration = durationFrame
        // Slow zoom in and back out
        const scale = interpolate(
          frame,
          [startTime, startTime + duration / 2, startTime + duration],
          [1, 1.2, 1],
          { extrapolateLeft: 'clamp', extrapolateRight: 'clamp' },
        )

        return (
          <Sequence key={key} from={startTime} durationInFrames={duration}>
            <AbsoluteFill
              style={{
                display: 'flex',
                justifyContent: 'center',
                alignItems: 'center',
              }}
            >
              <Img
                src={item}
                style={{
                  width: '100%',
                  height: '100%',
                  objectFit: 'cover',
                  transform: `scale(${scale})`,
                }}
              />
              <AbsoluteFill
                style={{
                  display: 'flex',
                  justifyContent: 'center',
                  alignItems: 'center',
                  textAlign: 'center',
                  fontSize: '1.25rem',
                  color: 'white',
                  position: 'absolute',
                  top: undefined,
                  bottom: 0,
                  height: '200px',
                }}
              >
                <p>{getCurrentCaption()}</p>
              </AbsoluteFill>
            </AbsoluteFill>
          </Sequence>
        )
      })}
      <Audio src={audioFileUrl} />
    </AbsoluteFill>
  )
}
6. Rendering a video

Use Remotion Lambda to generate the frames and encode the video.
Store the output URL and update our database.

6.1 Get the AWS serve URL and bucket name by following this guide from the Remotion documentation, then set the following environment variables in your .env.local file

NEXT_PUBLIC_API_URL=<Your API URL>
REMOTION_AWS_SERVE_URL=<Your AWS Serve URL for Remotion>
REMOTION_AWS_BUCKET_NAME=<Your AWS bucket name for Remotion>
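
A missing variable is easier to diagnose when it fails loudly at the point of use. The sketch below shows one way to do that in server-side code such as an API route; `requireEnv` is a hypothetical helper, not part of the project. It is not suited to client components, because Next.js only inlines `NEXT_PUBLIC_` variables when they are referenced literally as `process.env.NEXT_PUBLIC_...`.

```typescript
// Hypothetical helper: returns the variable or throws a descriptive error.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[name]
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`)
  }
  return value
}

// Stand-in for process.env so the example runs anywhere
const exampleEnv = { REMOTION_AWS_BUCKET_NAME: 'example-bucket' }
console.log(requireEnv('REMOTION_AWS_BUCKET_NAME', exampleEnv)) // example-bucket
```

In a route handler the call would look like `requireEnv('REMOTION_AWS_SERVE_URL', process.env)`.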

6.2 API route (app/api/render-video/route.ts)

import {
  getFunctions,
  getRenderProgress,
  renderMediaOnLambda,
} from '@remotion/lambda/client'
import type { NextRequest } from 'next/server'
import { NextResponse } from 'next/server'
import axios from 'axios'
import type { VideoData } from '@/app/lib/interface'

const serveUrl = process.env.REMOTION_AWS_SERVE_URL
const region = 'us-east-1'

export const POST = async (req: NextRequest) => {
  if (!serveUrl) {
    return NextResponse.json({ error: 'Serve URL not defined' }, { status: 500 })
  }

  const { id, audioFileUrl, captions, images }: VideoData = await req.json()

  try {
    // Fetch available Lambda functions
    const functions = await getFunctions({
      region,
      compatibleOnly: true,
    })

    if (functions.length === 0) {
      throw new Error('No compatible Lambda functions found.')
    }

    const functionName = functions[0].functionName
    // Caption timestamps are in milliseconds
    const captionsDuration = captions[captions.length - 1].end / 1000
    const fps = 30
    const durationInFrames = Math.ceil(captionsDuration * fps)

    // Start rendering process on AWS Lambda
    const { renderId, bucketName } = await renderMediaOnLambda({
      region,
      functionName,
      serveUrl,
      composition: 'shortVideo',
      inputProps: { audioFileUrl, captions, images, durationInFrames },
      codec: 'h264',
      maxRetries: 1,
      framesPerLambda: 100,
      privacy: 'public',
    })

    // Poll for render progress once per second
    while (true) {
      await new Promise((resolve) => setTimeout(resolve, 1000))
      const progress = await getRenderProgress({
        renderId,
        bucketName,
        functionName,
        region,
      })

      if (progress.done) {
        const outputFile = progress.outputFile
        console.log('Render finished!', outputFile)

        // Save the rendered video output URL
        await axios.put(`${process.env.NEXT_PUBLIC_API_URL}/videos/${id}`, {
          outputFile,
          renderId,
        })

        return NextResponse.json({ outputFile })
      }

      if (progress.fatalErrorEncountered) {
        console.error('Error encountered', progress.errors)
        return NextResponse.json({ error: progress.errors }, { status: 500 })
      }
    }
  } catch (error) {
    console.error('Error rendering video:', error)
    // Error objects serialize to {} in JSON, so send the message instead
    const message = error instanceof Error ? error.message : String(error)
    return NextResponse.json({ error: message }, { status: 500 })
  }
}
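
The route polls once per second until the render reports that it is done or has failed, with no upper bound on the wait. As a sketch of one possible refinement, the loop can be given a maximum number of attempts. Everything here (`pollUntilDone`, the `Progress` shape, the default limits) is hypothetical and independent of the Remotion API; `check` stands in for a call such as `getRenderProgress`.

```typescript
// Hypothetical progress shape, loosely modeled on a render status
type Progress<T> = { done: boolean; result?: T; error?: string }

// Calls check() repeatedly until it reports done or an error,
// waiting intervalMs between attempts and giving up after maxAttempts.
async function pollUntilDone<T>(
  check: () => Promise<Progress<T>>,
  intervalMs = 1000,
  maxAttempts = 300,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const progress = await check()
    if (progress.error) throw new Error(progress.error)
    if (progress.done) return progress.result as T
    await new Promise((resolve) => setTimeout(resolve, intervalMs))
  }
  throw new Error(`Gave up after ${maxAttempts} attempts`)
}

// Simulated render that finishes on the third status check
let checks = 0
const simulatedRender = pollUntilDone(
  async () => ({ done: ++checks >= 3, result: 'video.mp4' }),
  1,
  10,
)
simulatedRender.then((file) => console.log(file)) // video.mp4
```

A bound like this lets the request fail with a clear error rather than hang if a render never completes.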

6.2 Read the data passed in during rendering in remotion/Root.tsx:

import { Composition, getInputProps } from 'remotion'

import type { VideoData } from '@/app/lib/interface'

import { MyComposition } from './Composition'

// The render route sends audioFileUrl, captions, images and durationInFrames
// as top-level input props, so they are read directly rather than from a
// nested `video` object
const { durationInFrames, ...video } = getInputProps() as VideoData & {
  durationInFrames: number
}

export const RemotionRoot = () => {
  return (
    <Composition
      id='shortVideo'
      component={MyComposition}
      durationInFrames={durationInFrames}
      width={300}
      height={450}
      fps={30}
      defaultProps={video}
    />
  )
}

6.3 Create the video export function in app/components/VideoPlayerDialog.tsx:

const exportVideo = async () => {
  if (!video) return
  setIsLoading(true)

  try {
    // The route responds only after the Lambda render has finished
    const result = await axios.post('/api/render-video', {
      id: video.id,
      audioFileUrl: video.audioFileUrl,
      captions: video.captions,
      images: video.images,
    })

    console.log('Video Rendered:', result.data)

    if (result.data.outputFile) {
      setOutputFileUrl(result.data.outputFile)
      window.open(result.data.outputFile, '_blank')
    }
  } catch (error) {
    console.error('Error rendering video:', error)
  } finally {
    setIsLoading(false)
  }
}

6.4 Add the button that renders the video, or opens the output URL if it has already been rendered:

        <DialogFooter className='gap-4 border-t p-4 sm:justify-between'>
          <div className='flex space-x-2'>
            <Button type='button' variant='secondary' onClick={handleCancel}>
              <X className='mr-2 size-4' />
              Cancel
            </Button>
            {outputFileUrl ? (
              <Button
                type='button'
                onClick={() => window.open(outputFileUrl, '_blank')}
              >
                Open
              </Button>
            ) : (
              <Button type='button' onClick={exportVideo}>
                <Download className='mr-2 size-4' />
                Export
              </Button>
            )}
          </div>
        </DialogFooter>

7. Deleting a video

Deleting a video takes two steps:
- Remove the video file from AWS S3.
- Delete the video's data from our database.

7.1 API route (app/api/delete-video/route.ts)

import { deleteRender } from '@remotion/lambda/client'
import axios from 'axios'
import type { NextRequest } from 'next/server'
import { NextResponse } from 'next/server'

const bucketName = process.env.REMOTION_AWS_BUCKET_NAME

export const DELETE = async (req: NextRequest) => {
  if (!bucketName) {
    return NextResponse.json({ error: 'BucketName not defined' }, { status: 500 })
  }

  const { renderId, videoId } = await req.json()

  try {
    // Delete the video render from AWS S3 (skipped if the video was never rendered)
    if (renderId) {
      await deleteRender({ bucketName, region: 'us-east-1', renderId })
    }

    // Remove video metadata from the database
    await axios.delete(`${process.env.NEXT_PUBLIC_API_URL}/videos/${videoId}`)

    return NextResponse.json({ message: 'Video deleted successfully' }, { status: 200 })
  } catch (error) {
    console.error('Error deleting video:', error)
    // Return only the message: a raw error object does not serialize usefully to JSON
    const details = error instanceof Error ? error.message : String(error)
    return NextResponse.json({ error: 'Failed to delete video', details }, { status: 500 })
  }
}
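
When a caught error is returned in a JSON response, keep in mind that a plain `Error` instance serializes to `{}` under `JSON.stringify`, because `message` and `stack` are not enumerable properties. The sketch below shows one way to turn any caught value into a readable string first. `describeError` is a hypothetical helper name, not something the original project defines.

```typescript
// Turns any caught value into a string that is safe to put in a JSON body.
function describeError(error: unknown): string {
  if (error instanceof Error) return error.message
  if (typeof error === 'string') return error

  try {
    // JSON.stringify returns undefined for values such as undefined itself.
    return JSON.stringify(error) ?? String(error)
  } catch {
    // Circular structures make JSON.stringify throw.
    return String(error)
  }
}
```

In a route handler, the `catch` block could then respond with `{ error: 'Failed to delete video', details: describeError(error) }`.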

7.2 Create the function that deletes the video:

const handleDelete = async () => {
  // Close the dialog right away; the deletion continues in the background
  onClose()
  try {
    await axios.delete('/api/delete-video', {
      data: {
        videoId: video.id,
        renderId: video.renderId,
      },
    })
    // Reload the video list so the deleted entry disappears
    if (refreshVideos) {
      refreshVideos()
    }
  } catch (error) {
    console.error(error)
  }
}

7.3 Add the delete button:

<DialogFooter className='gap-4 border-t p-4 sm:justify-between'>
  <Button type='button' variant='destructive' onClick={handleDelete}>
    <Trash className='mr-2 size-4' />
    Delete
  </Button>
  <div className='flex space-x-2'>
    <Button type='button' variant='secondary' onClick={handleCancel}>
      <X className='mr-2 size-4' />
      Cancel
    </Button>
    {outputFileUrl ? (
      <Button
        type='button'
        onClick={() => window.open(outputFileUrl, '_blank')}
      >
        Open
      </Button>
    ) : (
      <Button type='button' onClick={exportVideo}>
        <Download className='mr-2 size-4' />
        Export
      </Button>
    )}
  </div>
</DialogFooter>

Conclusion

Building an AI-powered short video generator means integrating several technologies, including Next.js, Remotion, AWS Lambda, and AI-driven transcription and captioning. We covered the main aspects of implementing video rendering and deletion.

This implementation is only a foundation, and there is room for further improvement, such as:

- Adding user authentication
- Turning it into a SaaS product
- Adding more video effects and animations
- Adding more voices

With these improvements, the application can become a powerful tool for automated short video creation.

Source code

You can find the full implementation in the GitHub repository:

🔗 GitHub repository