I am proud to take on this noble work, which will contribute to society. Join me on this journey where AI cares for your loved ones. Join me in building AI Mama for your kid. These blogs are stepping stones toward successfully creating AI Mama for your kid.
MediaPipe demo for Human pose estimation
A brief introduction to MediaPipe
MediaPipe is an open-source framework for building cross-platform, multimodal applied machine learning pipelines. Developed by Google, it’s designed to facilitate the rapid development and deployment of machine learning-based features, with a particular focus on audio, video, and time series data.
Can I use MediaPipe for commercial use to run my business?
Yes, you can use MediaPipe for commercial purposes. MediaPipe is an open-source framework released under the Apache License 2.0, which is a permissive free software license. This means that you are allowed to use, modify, and distribute the software, including for commercial use, under the license terms.
MediaPipe can be a powerful tool for commercial projects. Still, it’s important to understand and adhere to the license terms, assess the technical suitability of your application, and be mindful of legal and regulatory considerations.
Can I run MediaPipe on a CPU machine?
Yes, you can run MediaPipe on your CPU. MediaPipe is designed to be cross-platform and flexible, allowing it to run on various hardware configurations, including standard CPUs. There is also support for GPUs and TPUs; using such hardware can enhance performance, especially for more complex models or real-time applications.
Can I run MediaPipe on macOS?
Yes, you can run MediaPipe on macOS. MediaPipe is designed to be cross-platform, supporting various operating systems including macOS, Windows, Linux, and mobile platforms like Android and iOS.
When we install MediaPipe, do the pre-trained weights get downloaded?
Yes, when you install MediaPipe using pip, it downloads the pre-trained model weights along with the package. This is because MediaPipe’s models, including the pose estimation model BlazePose, are designed to work “out of the box” with minimal setup. The model weights are bundled within the MediaPipe package, allowing you to use the models immediately after installation without any additional steps to download weights.
Installation steps of MediaPipe on macOS
Machine configuration: macOS, version 13.6.1, with 16.0 GB of RAM.
To install MediaPipe on macOS, follow these steps. Note that the process primarily involves setting up a Python environment and installing the MediaPipe package through pip. Here’s a detailed walkthrough:
Prerequisites
- Python: Ensure you have Python installed. MediaPipe works well with Python 3.6 or later. You can check your Python version by running
python3 --version
in the Terminal. The version of Python installed in my conda environment is Python 3.11.6.
- pip: Make sure pip (the Python package manager) is installed. If not, you can install it alongside Python from the official Python website.
Installation Steps
1. I am using an Anaconda environment to install all the packages required to run MediaPipe human pose estimation.
2. Open Terminal and create an Anaconda environment using the commands below. Updating pip is optional, but it's good practice to have the latest version:
conda create --name mediapipe python=3.11
conda activate mediapipe
python3 -m pip install --upgrade pip
3. Install MediaPipe:
With the virtual environment activated, run the below command:
pip install mediapipe
Everything should run smoothly, and in no time MediaPipe will be installed in your virtual environment.
To verify that MediaPipe is installed correctly, run the below commands.
python3
>>> import mediapipe as mp
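To sanity-check the install without entering the REPL, a small helper like the one below (my own convenience snippet built on the standard library, not part of MediaPipe) reports whether a package is importable in the current environment:

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the package can be imported in this environment."""
    return importlib.util.find_spec(package_name) is not None


# Should print True once the pip install above succeeded.
print(is_installed("mediapipe"))
```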
4. Install OpenCV
Use the below command to install OpenCV, as we will need it at several steps.
pip install opencv-python
Here is what I have installed in my Anaconda virtual environment.
Package Version
----------------------- --------
absl-py 2.0.0
attrs 23.1.0
cffi 1.16.0
contourpy 1.2.0
cycler 0.12.1
flatbuffers 23.5.26
fonttools 4.47.0
kiwisolver 1.4.5
matplotlib 3.8.2
mediapipe 0.10.9
numpy 1.26.2
opencv-contrib-python 4.8.1.78
opencv-python 4.8.1.78
packaging 23.2
Pillow 10.1.0
pip 23.3.2
protobuf 3.20.3
pycparser 2.21
pyparsing 3.1.1
PyQt6 6.5.2
PyQt6-3D 6.5.0
PyQt6-Charts 6.5.0
PyQt6-DataVisualization 6.5.0
PyQt6-NetworkAuth 6.5.0
PyQt6-sip 13.5.2
PyQt6-WebEngine 6.5.0
python-dateutil 2.8.2
setuptools 68.2.2
six 1.16.0
sounddevice 0.4.6
wheel 0.41.2
The input video:
The output video:
The script:
The default model used in the mp.solutions.pose module of MediaPipe for pose estimation is known as BlazePose. BlazePose detects 33 body landmarks in real time for a single person per frame. It's designed to be lightweight while still providing high accuracy, which enables it to run efficiently on-device across different platforms, including mobile and desktop.
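To make the 33-landmark output more concrete, here is a small subset of the landmark indices, reproduced by hand from the MediaPipe documentation (verify against mp.solutions.pose.PoseLandmark in your installed version; the helper function is my own illustration, not a MediaPipe API):

```python
# A subset of BlazePose's 33 landmark indices, copied by hand from the
# MediaPipe docs -- check against mp.solutions.pose.PoseLandmark.
POSE_LANDMARKS = {
    0: "nose",
    11: "left_shoulder",
    12: "right_shoulder",
    23: "left_hip",
    24: "right_hip",
    27: "left_ankle",
    28: "right_ankle",
}


def landmark_name(index: int) -> str:
    """Return the body-part name for a landmark index, if known."""
    return POSE_LANDMARKS.get(index, f"landmark_{index}")
```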
import os

import cv2
import mediapipe as mp

input_video_path = '/Users/prajendr/Downloads/test_data.mp4'
mediapipe_outdir = '/Users/prajendr/leaddatascientist/data/mediapipe_output/'
output_video_path = mediapipe_outdir + 'video.mp4'

if not os.path.exists(mediapipe_outdir):
    os.makedirs(mediapipe_outdir)

# Initialize MediaPipe Pose.
mp_pose = mp.solutions.pose
pose = mp_pose.Pose()

# Initialize the MediaPipe drawing module for annotations.
mp_drawing = mp.solutions.drawing_utils

# Open the local video file.
cap = cv2.VideoCapture(input_video_path)

# Get video properties for the output file.
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Define the codec and create a VideoWriter object to save the output video.
out = cv2.VideoWriter(output_video_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (frame_width, frame_height))

while cap.isOpened():
    success, image = cap.read()
    if not success:
        break

    # Convert the BGR image to RGB and process it with MediaPipe Pose.
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = pose.process(image)

    # Draw the pose annotations on the image.
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    if results.pose_landmarks:
        mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)

    # Write the frame into the output file.
    out.write(image)

    # # Display the annotated image (optional, can be commented out).
    # cv2.imshow('MediaPipe Pose', image)
    # # Break the loop when 'q' is pressed.
    # if cv2.waitKey(1) & 0xFF == ord('q'):
    #     break

# Release resources.
cap.release()
out.release()
cv2.destroyAllWindows()
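One detail worth knowing when working with the script above: MediaPipe reports landmark coordinates normalized to the [0, 1] range relative to the frame size. If you want pixel coordinates (for example, to crop around a joint), a small conversion helper like this hypothetical one is enough:

```python
def to_pixel_coords(x_norm: float, y_norm: float,
                    frame_width: int, frame_height: int) -> tuple[int, int]:
    """Convert MediaPipe's normalized [0, 1] landmark coordinates
    to integer pixel coordinates within the frame bounds."""
    px = min(int(x_norm * frame_width), frame_width - 1)
    py = min(int(y_norm * frame_height), frame_height - 1)
    return px, py


# Example: a landmark at the centre of a 1920x1080 frame.
print(to_pixel_coords(0.5, 0.5, 1920, 1080))  # (960, 540)
```

The min() clamp keeps a landmark reported exactly at 1.0 inside the valid pixel range.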
How It Works:
- Set Video Paths: Change input_video_path to the path of your local video file and output_video_path to where you want to save the output video.
- Video Capture and Writer: The script reads frames from the input video and initializes a cv2.VideoWriter object to write the processed frames to an output video.
- Process Each Frame: For each frame, convert it to RGB, process it with MediaPipe Pose, and then draw the pose landmarks.
- Save Output: Each annotated frame is written to the output video file.
- Display and Quit: The processed frames can be displayed in a window, with 'q' to quit prematurely. This step is optional and is commented out in the script.
- Cleanup: Release the video capture and writer objects, and destroy any OpenCV windows.
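Since we discussed earlier whether CPU-only processing is adequate, it helps to measure how fast the loop actually runs. A tiny frame-rate meter built on the standard library (my own sketch, not part of MediaPipe or OpenCV) does the job; call tick() once per frame inside the processing loop:

```python
import time


class FpsMeter:
    """Track the average number of processed frames per second."""

    def __init__(self):
        self.start = time.perf_counter()
        self.frames = 0

    def tick(self):
        """Call once per processed frame."""
        self.frames += 1

    def fps(self) -> float:
        """Average frames per second since the meter was created."""
        elapsed = time.perf_counter() - self.start
        return self.frames / elapsed if elapsed > 0 else 0.0
```

If the measured rate comfortably exceeds the video's native FPS, the CPU is keeping up in real time; otherwise a GPU or a lighter model complexity setting may be worth trying.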
Conclusion:
There are no limits to the possibility of good work for humanity.
If you feel driven and touched by the cause of this AI project then, join me in this journey to build AI Mama for your kid.
I am publishing this blog with the MediaPipe warmup script. MediaPipe has opened new avenues for innovation and problem-solving. There is so much more to do with MediaPipe. Let’s keep working and exploring the best solution for the AI Mama.