Android offers various APIs for video encoding, decoding, and editing. In my other posts I already showed some examples of how to use these APIs. In this post I'd like to extend the information from my previous posts and add a new use case - embedding text into an existing video.

If you want to add captions, titles, or other textual information to an existing video file using only standard Android APIs, this post might be something for you. The technique I show is basically overlaying video frames with text that has been converted to a texture. This means you can put any kind of texture on top of the video - it doesn't have to be text.

It should be possible to follow this post without reading my previous ones. But if you still have questions, I recommend checking out the following posts

Creating video from images

Adding audio to video

Converting video to greyscale

You can find sample code related to this post on GitHub.

If you're an Android enthusiast who likes to learn more about Android internals, I highly recommend checking out my Bugjaeger app. It allows you to connect 2 Android devices through USB OTG and perform many of the tasks that are normally only accessible from a developer machine via ADB directly from your Android phone/tablet.

High-Level Overview

Here's a simplified summary of the steps I will perform to create the final video file

  1. Use MediaExtractor to extract metadata and encoded frames from video file
  2. Decode the encoded frames with MediaCodec
  3. Render decoded video frame with OpenGL ES2
  4. Render text
  5. Encode the processed frames with MediaCodec again
  6. Save the encoded video in the proper container format into a file using MediaMuxer

Extracting Frames From Video File

Before you can edit an existing video and add text to it, you first need to extract the encoded frames from the given container format. Android allows extracting and decoding frames from multiple video formats.

The class that allows you to parse a video (or audio) file and extract metadata and encoded frames is called MediaExtractor.

I already showed how to use MediaExtractor in my previous post, so I don't want to repeat too much of that information. Here's just a quick summary

val extractor = MediaExtractor()
extractor.setDataSource(inFilePath)

for (i in 0 until extractor.trackCount) {
    val format = extractor.getTrackFormat(i)
    val mime = format.getString(MediaFormat.KEY_MIME)

    if (mime.startsWith("video/")) {

        extractor.selectTrack(i)
        // Read the frames from this track here
        // ...
    }
}

The code above allows you to select the input video track on top of which you would like to render the text.
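It's also worth keeping the selected track's MediaFormat (and track index) around - you'll need the format to configure the decoder later, and you can derive the encoder settings from it. Here's a small sketch, where videoTrackIndex is just an assumed name for the index you selected above

// Keep the input format around - the decoder is configured with it later, and
// properties like width/height are useful when setting up the encoder
val inputFormat = extractor.getTrackFormat(videoTrackIndex)
val inputWidth = inputFormat.getInteger(MediaFormat.KEY_WIDTH)
val inputHeight = inputFormat.getInteger(MediaFormat.KEY_HEIGHT)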

Once you've selected the video track, you can extract encoded video frames with the following code

val maxChunkSize = 1024 * 1024
val buffer = ByteBuffer.allocate(maxChunkSize)
val bufferInfo = MediaCodec.BufferInfo()

// Extract all frames from selected track
while (true) {
    val chunkSize = extractor.readSampleData(buffer, 0)

    if (chunkSize > 0) {
        // Process extracted frame here
        // ...

        extractor.advance()

    } else {
        // All frames extracted - we're done
        break
    }
}

You can then pass the encoded frames to a MediaCodec decoder, which makes the decoded frames available to OpenGL ES2 for editing and rendering.

Initializing MediaCodec & Surfaces

To accomplish the given task, you'll normally need 2 instances of MediaCodec - an encoder and a decoder.

MediaCodec moves the video data through a Surface, which is an opaque handle to video buffers. You'll need 2 Surfaces. One is provided directly by the MediaCodec encoder (the input Surface). The other one you can create with the help of SurfaceTexture and EGL.

Here's how you can initialize the encoder, decoder, and both Surfaces

// Configure video output format - here you can adjust values according to input Format
// that you've got from MediaExtractor in previous section
val mime = "video/avc"
val width = 320; val height = 180
val outFormat = MediaFormat.createVideoFormat(mime, width, height)
outFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface)
outFormat.setInteger(MediaFormat.KEY_BIT_RATE, 2000000)
outFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 30)
outFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 15)

// Init encoder
encoder = MediaCodec.createEncoderByType(mime)
encoder.configure(outFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
inputSurface = encoder.createInputSurface()

// Prepare EGL context by using the inputSurface we've got from encoder
eglDisplay = EGL14.eglGetDisplay(EGL14.EGL_DEFAULT_DISPLAY)
if (eglDisplay == EGL14.EGL_NO_DISPLAY)
    throw RuntimeException("eglDisplay == EGL14.EGL_NO_DISPLAY: "
            + GLUtils.getEGLErrorString(EGL14.eglGetError()))

val version = IntArray(2)
if (!EGL14.eglInitialize(eglDisplay, version, 0, version, 1))
    throw RuntimeException("eglInitialize(): " + GLUtils.getEGLErrorString(EGL14.eglGetError()))

val attribList = intArrayOf(
    EGL14.EGL_RED_SIZE, 8,
    EGL14.EGL_GREEN_SIZE, 8,
    EGL14.EGL_BLUE_SIZE, 8,
    EGL14.EGL_ALPHA_SIZE, 8,
    EGL14.EGL_RENDERABLE_TYPE, EGL14.EGL_OPENGL_ES2_BIT,
    EGLExt.EGL_RECORDABLE_ANDROID, 1,
    EGL14.EGL_NONE
)
val configs = arrayOfNulls<EGLConfig>(1)
val nConfigs = IntArray(1)
if (!EGL14.eglChooseConfig(eglDisplay, attribList, 0, configs, 0, configs.size, nConfigs, 0))
    throw RuntimeException(GLUtils.getEGLErrorString(EGL14.eglGetError()))

var err = EGL14.eglGetError()
if (err != EGL14.EGL_SUCCESS)
    throw RuntimeException(GLUtils.getEGLErrorString(err))

val ctxAttribs = intArrayOf(
    EGL14.EGL_CONTEXT_CLIENT_VERSION, 2,
    EGL14.EGL_NONE
)
val eglContext = EGL14.eglCreateContext(eglDisplay, configs[0], EGL14.EGL_NO_CONTEXT, ctxAttribs, 0)

err = EGL14.eglGetError()
if (err != EGL14.EGL_SUCCESS)
    throw RuntimeException(GLUtils.getEGLErrorString(err))

val surfaceAttribs = intArrayOf(
    EGL14.EGL_NONE
)

eglSurface = EGL14.eglCreateWindowSurface(eglDisplay, configs[0], inputSurface, surfaceAttribs, 0)

err = EGL14.eglGetError()
if (err != EGL14.EGL_SUCCESS)
    throw RuntimeException(GLUtils.getEGLErrorString(err))

if (!EGL14.eglMakeCurrent(eglDisplay, eglSurface, eglSurface, eglContext))
    throw RuntimeException("eglMakeCurrent(): " + GLUtils.getEGLErrorString(EGL14.eglGetError()))

// Prepare a texture handle for SurfaceTexture
val textureHandles = IntArray(1)
GLES20.glGenTextures(1, textureHandles, 0)
GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, textureHandles[0])

surfaceTexture = SurfaceTexture(textureHandles[0])

// The onFrameAvailable() callback will be called from our HandlerThread
val thread = HandlerThread("FrameHandlerThread")
thread.start()

surfaceTexture.setOnFrameAvailableListener({
    synchronized(lock) {

        // New frame available before the last frame was processed... we dropped some frames
        if (frameAvailable)
            Log.d(TAG, "Frame available before the last frame was processed... we dropped some frames")

        frameAvailable = true
        lock.notifyAll()
    }
}, Handler(thread.looper))

// Create our output surface for decoder
outputSurface = Surface(surfaceTexture)

// Finish decoder configuration with our outputSurface
decoder = MediaCodec.createDecoderByType(inputFormat.getString(MediaFormat.KEY_MIME))
decoder.configure(inputFormat, outputSurface, null, 0)

In the code above I merged the initial configuration steps from my previous post. Here's a short summary of what I did

  1. I configured the encoder from which I could get an input Surface
  2. I used the input Surface for the EGL context setup. EGL context was necessary for calling OpenGL ES functions
  3. Once I made the EGL context current, I called OpenGL methods to get a texture id/handle and I bound it to GL_TEXTURE_EXTERNAL_OES. This allows me to access decoded frames inside of OpenGL as a texture.
  4. I passed the texture handle from step 3 to SurfaceTexture's constructor to get an instance of SurfaceTexture. I then set up additional callback that notifies me when a new frame is available.
  5. Once I have a SurfaceTexture, I can then use it to create an output Surface for configuring the decoder.
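The snippets above also use a few members that I haven't declared explicitly (lock, frameAvailable, the Surfaces, and the codecs). Here's a minimal sketch of how they could be declared - the names are just assumptions matching the code in this post

// Members assumed by the snippets in this post
private val lock = Object()          // guards frameAvailable
private var frameAvailable = false   // set from the OnFrameAvailableListener thread

private lateinit var encoder: MediaCodec
private lateinit var decoder: MediaCodec
private lateinit var inputSurface: Surface
private lateinit var outputSurface: Surface
private lateinit var surfaceTexture: SurfaceTexture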

Decoding, Encoding, Muxing

After the previous setup, you can start the decoder and encoder and prepare the MediaMuxer. MediaMuxer will create the final video output file

encoder.start()
decoder.start()
muxer = MediaMuxer("/path/to/out.mp4", MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)

And here's how you perform extracting, decoding, editing, encoding, and muxing

var allInputExtracted = false
var allInputDecoded = false
var allOutputEncoded = false

val timeoutUs = 10000L
val bufferInfo = MediaCodec.BufferInfo()
var trackIndex = -1

while (!allOutputEncoded) {
    // Feed input to decoder
    if (!allInputExtracted) {
        val inBufferId = decoder.dequeueInputBuffer(timeoutUs)
        if (inBufferId >= 0) {
            val buffer = decoder.getInputBuffer(inBufferId)
            val sampleSize = extractor.readSampleData(buffer, 0)

            if (sampleSize >= 0) {
                decoder.queueInputBuffer(
                    inBufferId, 0, sampleSize,
                    extractor.sampleTime, extractor.sampleFlags
                )

                extractor.advance()
            } else {
                decoder.queueInputBuffer(
                    inBufferId, 0, 0,
                    0, MediaCodec.BUFFER_FLAG_END_OF_STREAM
                )
                allInputExtracted = true
            }
        }
    }

    var encoderOutputAvailable = true
    var decoderOutputAvailable = !allInputDecoded

    while (encoderOutputAvailable || decoderOutputAvailable) {
        // Drain Encoder & mux to output file first
        val outBufferId = encoder.dequeueOutputBuffer(bufferInfo, timeoutUs)

        if (outBufferId >= 0) {
            val encodedBuffer = encoder.getOutputBuffer(outBufferId)

            muxer.writeSampleData(trackIndex, encodedBuffer, bufferInfo)

            encoder.releaseOutputBuffer(outBufferId, false)

            // Are we finished here?
            if ((bufferInfo.flags and MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                allOutputEncoded = true
                break
            }
        } else if (outBufferId == MediaCodec.INFO_TRY_AGAIN_LATER) {
            encoderOutputAvailable = false
        } else if (outBufferId == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            trackIndex = muxer.addTrack(encoder.outputFormat)
            muxer.start()
        }

        if (outBufferId != MediaCodec.INFO_TRY_AGAIN_LATER)
            continue

        // Get output from decoder and feed it to encoder
        if (!allInputDecoded) {
            val outBufferId = decoder.dequeueOutputBuffer(bufferInfo, timeoutUs)
            if (outBufferId >= 0) {
                val render = bufferInfo.size > 0
                // Give the decoded frame to SurfaceTexture (onFrameAvailable() callback should
                // be called soon after this)
                decoder.releaseOutputBuffer(outBufferId, render)
                if (render) {
                    // Wait till new frame available after onFrameAvailable has been called
                    synchronized(lock) {
                        while (!frameAvailable) {
                            lock.wait(500)
                            if (!frameAvailable)
                                Log.e(TAG,"Surface frame wait timed out")
                        }
                        frameAvailable = false
                    }

                    surfaceTexture.updateTexImage()
                    surfaceTexture.getTransformMatrix(texMatrix)

                    // Render video frame as a texture with OpenGL ES
                    // ...

                    // Render the text in OpenGL ES
                    // ...

                    EGLExt.eglPresentationTimeANDROID(eglDisplay, eglSurface, 
                        bufferInfo.presentationTimeUs * 1000)

                    EGL14.eglSwapBuffers(eglDisplay, eglSurface)
                }

                // Did we get all output from decoder?
                if ((bufferInfo.flags and MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                    allInputDecoded = true
                    encoder.signalEndOfInputStream()
                }

            } else if (outBufferId == MediaCodec.INFO_TRY_AGAIN_LATER) {
                decoderOutputAvailable = false
            }
        }
    }
}

The code above performs the steps I showed in the High-Level Overview section. The nested loops are necessary because you won't necessarily get a frame out of the decoder right after you've fed a buffer into it. You first need to fill up the decoder's input buffers and then check whether an output buffer with decoded data is actually available. The same goes for the encoder - you can see that I'm draining it in a loop.

Once I get the final input buffer from MediaExtractor, I queue an empty buffer to signal the end of stream. This comes out at the other end of the pipe, and I then do the same thing to the encoder (but for the encoder I use signalEndOfInputStream() because I'm feeding it through a Surface).
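Once the whole pipeline has finished, don't forget to release the resources you've created. Here's a minimal cleanup sketch, assuming the objects from the initialization sections above

extractor.release()

decoder.stop()
decoder.release()

encoder.stop()
encoder.release()

muxer.stop()
muxer.release()

surfaceTexture.release()
outputSurface.release()
inputSurface.release()

EGL14.eglMakeCurrent(eglDisplay, EGL14.EGL_NO_SURFACE, EGL14.EGL_NO_SURFACE, EGL14.EGL_NO_CONTEXT)
EGL14.eglDestroySurface(eglDisplay, eglSurface)
EGL14.eglDestroyContext(eglDisplay, eglContext)
EGL14.eglTerminate(eglDisplay)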

I skipped the drawing code to make this section a bit shorter and more focused. I'll show how to render the video frames and text in the following sections.

Rendering Video Frame in OpenGL

The SurfaceTexture and Surface initialized in the previous step will allow you to access decoded video frames as a texture through OpenGL ES2.

Inside your fragment shader, the texture is accessed through a sampler of type samplerExternalOES. In the rest of your OpenGL code you use the GL_TEXTURE_EXTERNAL_OES target to configure the texture parameters.

Here's the OpenGL initialization code that you should execute after you've made the EGL context current on your processing thread

private val vertexShaderCode =
    """
    precision highp float;
    attribute vec3 vertexPosition;
    attribute vec2 uvs;
    varying vec2 varUvs;
    uniform mat4 texMatrix;
    uniform mat4 mvp;

    void main()
    {
        varUvs = (texMatrix * vec4(uvs.x, uvs.y, 0, 1.0)).xy;
        gl_Position = mvp * vec4(vertexPosition, 1.0);
    }
    """

private val fragmentShaderCode =
    """
    #extension GL_OES_EGL_image_external : require
    precision mediump float;

    varying vec2 varUvs;
    uniform samplerExternalOES texSampler;

    void main()
    {
        // Optional grayscale effect from my previous post - output c directly to keep the original colors
        vec4 c = texture2D(texSampler, varUvs);
        float gs = 0.299*c.r + 0.587*c.g + 0.114*c.b;
        gl_FragColor = vec4(gs, gs, gs, c.a);
    }
    """

private var vertices = floatArrayOf(
    // x, y, z, u, v
    -1.0f, -1.0f, 0.0f, 0f, 0f,
    -1.0f, 1.0f, 0.0f, 0f, 1f,
    1.0f, 1.0f, 0.0f, 1f, 1f,
    1.0f, -1.0f, 0.0f, 1f, 0f
)

private var indices = intArrayOf(
    2, 1, 0, 0, 3, 2
)

private var program: Int
private var vertexHandle: Int = 0
private var bufferHandles = IntArray(2)
private var uvsHandle: Int = 0
private var texMatrixHandle: Int = 0
private var mvpHandle: Int = 0
private var samplerHandle: Int = 0
private val textureHandles = IntArray(1)

private var vertexBuffer: FloatBuffer = ByteBuffer.allocateDirect(vertices.size * 4).run {
    order(ByteOrder.nativeOrder())
    asFloatBuffer().apply {
        put(vertices)
        position(0)
    }
}

private var indexBuffer: IntBuffer = ByteBuffer.allocateDirect(indices.size * 4).run {
    order(ByteOrder.nativeOrder())
    asIntBuffer().apply {
        put(indices)
        position(0)
    }
}

...

init {
    // Create program
    val vertexShader: Int = loadShader(GLES20.GL_VERTEX_SHADER, vertexShaderCode)
    val fragmentShader: Int = loadShader(GLES20.GL_FRAGMENT_SHADER, fragmentShaderCode)

    program = GLES20.glCreateProgram().also {
        GLES20.glAttachShader(it, vertexShader)
        GLES20.glAttachShader(it, fragmentShader)
        GLES20.glLinkProgram(it)

        vertexHandle = GLES20.glGetAttribLocation(it, "vertexPosition")
        uvsHandle = GLES20.glGetAttribLocation(it, "uvs")
        texMatrixHandle = GLES20.glGetUniformLocation(it, "texMatrix")
        mvpHandle = GLES20.glGetUniformLocation(it, "mvp")
        samplerHandle = GLES20.glGetUniformLocation(it, "texSampler")
    }

    // Initialize buffers
    GLES20.glGenBuffers(2, bufferHandles, 0)

    GLES20.glBindBuffer(GLES20.GL_ARRAY_BUFFER, bufferHandles[0])
    GLES20.glBufferData(GLES20.GL_ARRAY_BUFFER, vertices.size * 4, vertexBuffer, GLES20.GL_DYNAMIC_DRAW)

    GLES20.glBindBuffer(GLES20.GL_ELEMENT_ARRAY_BUFFER, bufferHandles[1])
    GLES20.glBufferData(GLES20.GL_ELEMENT_ARRAY_BUFFER, indices.size * 4, indexBuffer, GLES20.GL_DYNAMIC_DRAW)

    // Init texture that will receive decoded frames
    GLES20.glGenTextures(1, textureHandles, 0)
    GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, textureHandles[0])
    GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_MIN_FILTER,
        GLES20.GL_NEAREST)
    GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_MAG_FILTER,
        GLES20.GL_LINEAR)
    GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_WRAP_S,
        GLES20.GL_CLAMP_TO_EDGE)
    GLES20.glTexParameteri(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, GLES20.GL_TEXTURE_WRAP_T,
        GLES20.GL_CLAMP_TO_EDGE)

    // Ensure I can draw transparent stuff that overlaps properly
    GLES20.glEnable(GLES20.GL_BLEND)
    GLES20.glBlendFunc(GLES20.GL_SRC_ALPHA, GLES20.GL_ONE_MINUS_SRC_ALPHA)
}

private fun loadShader(type: Int, shaderCode: String): Int {
    return GLES20.glCreateShader(type).also { shader ->
        GLES20.glShaderSource(shader, shaderCode)
        GLES20.glCompileShader(shader)

        // Check the compile status - shader typos are much easier to track down this way
        val status = IntArray(1)
        GLES20.glGetShaderiv(shader, GLES20.GL_COMPILE_STATUS, status, 0)
        if (status[0] == 0)
            throw RuntimeException(GLES20.glGetShaderInfoLog(shader))
    }
}
...

This initial setup should allow you to render the video frame texture onto a quad. I'm also passing an MVP matrix with my transformations, plus the special transformation matrix provided by SurfaceTexture that transforms the texture UVs.

You can see from the code in the previous section that I first wait until a new frame is available before I perform the OpenGL rendering. When there's a new frame, I get notified through the OnFrameAvailableListener. Once the listener has been called, I can execute SurfaceTexture.updateTexImage(), which binds the new video frame to the GL_TEXTURE_EXTERNAL_OES target. Note that updateTexImage() should only be called from your GL thread (the one where you made the EGL context current).

The OpenGL rendering code can then be executed right after the call to getTransformMatrix() (I left a placeholder comment for this in the previous section of this blog post)

// Set the clear color before clearing the buffers
GLES20.glClearColor(0f, 0f, 0f, 0f)
GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT or GLES20.GL_DEPTH_BUFFER_BIT)

GLES20.glViewport(0, 0, viewportWidth, viewportHeight)

GLES20.glUseProgram(program)

// Pass transformations to shader
GLES20.glUniformMatrix4fv(texMatrixHandle, 1, false, texMatrix, 0)
GLES20.glUniformMatrix4fv(mvpHandle, 1, false, mvpMatrix, 0)

// Prepare buffers with vertices and indices & draw
GLES20.glBindBuffer(GLES20.GL_ARRAY_BUFFER, bufferHandles[0])
GLES20.glBindBuffer(GLES20.GL_ELEMENT_ARRAY_BUFFER, bufferHandles[1])

GLES20.glEnableVertexAttribArray(vertexHandle)
GLES20.glVertexAttribPointer(vertexHandle, 3, GLES20.GL_FLOAT, false, 4 * 5, 0)

GLES20.glEnableVertexAttribArray(uvsHandle)
GLES20.glVertexAttribPointer(uvsHandle, 2, GLES20.GL_FLOAT, false, 4 * 5, 3 * 4)

// Note: GL_UNSIGNED_INT indices require the OES_element_index_uint extension on OpenGL ES2 -
// switch the index buffer to GL_UNSIGNED_SHORT if you need to support older devices
GLES20.glDrawElements(GLES20.GL_TRIANGLES, 6, GLES20.GL_UNSIGNED_INT, 0)

You can use the mvpMatrix to perform various video effects, like rotation or frame scaling. In my case I just set it to identity

val mvp = FloatArray(16)
Matrix.setIdentityM(mvp, 0)

// Perform additional transformations
// Matrix.scaleM(mvp, 0, 1f, -1f, 1f)
//

Rendering the Text on Top of Video

To render the text, it first needs to be converted into a texture so that I can use it through OpenGL.

Android provides the Paint API, which allows you to draw text onto a Canvas. A Canvas lets you draw into a Bitmap, and the Bitmap can then easily be used as an OpenGL texture.

If your text doesn't change every frame, it's better to prepare the Bitmap containing the text once, outside of the drawing loop

// I want to get a texture with the following width/height
val width = 640
val height = 480

// The text that should be rendered (just an example)
val text = "Hello, video!"

val paint = Paint(Paint.ANTI_ALIAS_FLAG)

// Pick an initial size to calculate the requested size later
paint.textSize = 62f

// Configure your text properties
paint.color = Color.RED
paint.textAlign = Paint.Align.LEFT // This affects the origin of x in Canvas.drawText()
// setTypeface(), setUnderlineText(), ....

// After setting parameters that could affect the size and position,
// now try to fit text within requested bitmap width & height
val bounds = Rect()
paint.getTextBounds(text, 0, text.length, bounds)

// Fit to requested width
paint.textSize = paint.textSize * width.toFloat() / bounds.width()

// Or fit to height
// paint.textSize = ceil(paint.textSize * height.toDouble() / bounds.height()).toFloat()

val bitmap = Bitmap.createBitmap(width, height, Bitmap.Config.ARGB_8888)
val canvas = Canvas(bitmap)

// Measure once again to get the current top, left position, so that
// we can position the final text from the top left corner
paint.getTextBounds(text, 0, text.length, bounds)

canvas.drawText(text, -bounds.left.toFloat(), -bounds.top.toFloat(), paint)

You still might need to add some padding to make sure your text won't get cropped. When I was testing this, I saw some cropping when there was a big relative difference between the dimensions (e.g. a tiny height, like 800x50). The pixel dimensions used by Paint and Canvas are mostly floats, while Rect uses ints, and the conversions could cause some inaccuracies. I'm not sure... I should probably check the APIs more thoroughly to find out what's going on.
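If you hit this, a simple workaround is to bake a few pixels of padding into the bitmap yourself, replacing the bitmap creation and drawText() calls above. A quick sketch - the padding value is just an assumption you'd tune for your font and sizes

val padding = 16 // assumed value - tune for your font and sizes

val bitmap = Bitmap.createBitmap(width + 2 * padding, height + 2 * padding, Bitmap.Config.ARGB_8888)
val canvas = Canvas(bitmap)

paint.getTextBounds(text, 0, text.length, bounds)
canvas.drawText(text, (padding - bounds.left).toFloat(), (padding - bounds.top).toFloat(), paint)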

The OpenGL code for rendering a Bitmap is similar to the previous section. I don't want to duplicate too much code, so I'll only show the main differences (look at my previous post for more details).

The vertex and fragment shaders are slightly different now. I'm not using the texture matrix provided by SurfaceTexture anymore and I changed the sampler type to the regular sampler2D used for textures

private val vertexShaderCode =
    """                       
    precision highp float;
    attribute vec3 vertexPosition;
    attribute vec2 uvs;
    varying vec2 varUvs;
    uniform mat4 mvp;

    void main()
    {
        varUvs = uvs;
        gl_Position = mvp * vec4(vertexPosition, 1.0);
    }
    """

private val fragmentShaderCode =
    """
    precision mediump float;         
    varying vec2 varUvs;
    uniform sampler2D texSampler;

    void main()
    {
        gl_FragColor = texture2D(texSampler, varUvs);
    }
    """

The OpenGL initialization code is almost the same. I don't need the texture matrix from SurfaceTexture to draw the text, so I removed the part that initializes the handle to the texMatrix uniform. I also changed the target to GL_TEXTURE_2D when configuring the texture parameters

...
GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, textureHandles[0])
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MIN_FILTER,
    GLES20.GL_NEAREST)
...

When rendering the video frame with OpenGL in the previous section, the system made the texture available inside the fragment shader once I called updateTexImage(). Now we need to upload the Bitmap with the text to our texture target directly through the OpenGL API

...
GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, textureHandles[0])        
GLES20.glPixelStorei(GLES20.GL_UNPACK_ALIGNMENT, 1)

GLUtils.texImage2D(GLES20.GL_TEXTURE_2D, 0, bitmap, 0)
...

The drawing code is basically the same. We only use the regular GL_TEXTURE_2D target, and we also need to supply the texture data to the shader ourselves.

GLES20.glUseProgram(program)

// Pass transformations to shader
GLES20.glUniformMatrix4fv(mvpHandle, 1, false, mvpMatrix, 0)

// Prepare the texture. Because we use a regular Bitmap this time, we need to upload
// it to the shader's texture unit ourselves.
GLES20.glActiveTexture(GLES20.GL_TEXTURE0)
GLES20.glUniform1i(samplerHandle, 0)

GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, textureHandles[0])
GLES20.glPixelStorei(GLES20.GL_UNPACK_ALIGNMENT, 1)
GLUtils.texImage2D(GLES20.GL_TEXTURE_2D, 0, bitmap, 0)

GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MIN_FILTER,
    GLES20.GL_NEAREST)
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MAG_FILTER,
    GLES20.GL_LINEAR)
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_WRAP_S,
    GLES20.GL_CLAMP_TO_EDGE)
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_WRAP_T,
    GLES20.GL_CLAMP_TO_EDGE)

// Prepare buffers with vertices and indices & draw
GLES20.glBindBuffer(GLES20.GL_ARRAY_BUFFER, bufferHandles[0])
GLES20.glBindBuffer(GLES20.GL_ELEMENT_ARRAY_BUFFER, bufferHandles[1])

GLES20.glEnableVertexAttribArray(vertexHandle)
GLES20.glVertexAttribPointer(vertexHandle, 3, GLES20.GL_FLOAT, false, 4 * 5, 0)

GLES20.glEnableVertexAttribArray(uvsHandle)
GLES20.glVertexAttribPointer(uvsHandle, 2, GLES20.GL_FLOAT, false, 4 * 5, 3 * 4)

GLES20.glDrawElements(GLES20.GL_TRIANGLES, 6, GLES20.GL_UNSIGNED_INT, 0)
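You'll most likely also want to position and scale the text quad so that it only covers a part of the video frame instead of the whole viewport. The mvpMatrix passed to the shader is the natural place for that. Here's a small sketch - the concrete translate/scale values are just assumptions you'd tune

// Place the text quad as a flat strip near the bottom of the frame
// (clip space spans [-1, 1]; the values below are assumptions to tune)
val textMvp = FloatArray(16)
Matrix.setIdentityM(textMvp, 0)
Matrix.translateM(textMvp, 0, 0f, -0.7f, 0f) // move towards the bottom
Matrix.scaleM(textMvp, 0, 0.8f, 0.2f, 1f)    // shrink to a wide strip

GLES20.glUniformMatrix4fv(mvpHandle, 1, false, textMvp, 0)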

Conclusion

Android offers a rich set of APIs that allow you to play with video and various media formats. In this post I showed how to use only the native APIs (without external libraries) to add a text layer on top of an existing video.

If you want to learn more about video processing on Android, I also recommend checking out my other posts.

Android supports the ETC1 texture compression format, and I've created a web version of the etc1tool from AOSP that can convert PNG files into ETC1-compressed textures. I recommend checking it out.

Additionally, if you're a developer or an Android enthusiast who hacks around Android devices with tools like ADB or fastboot, I also recommend checking out my Bugjaeger app. Bugjaeger allows you to perform many of the tasks you'd normally do from your development computer through ADB/fastboot directly from another Android device. One related feature is, for example, capturing the screen in real time, which underneath also uses the MediaCodec API.

