React Native ExecuTorch v0.8.0 – A Library Milestone
Mateusz Kopciński • Apr 3, 2026 • 5 min read

With React Native ExecuTorch v0.8.0, we’re pushing that boundary further. This is our biggest release so far, packed with major improvements and new capabilities. Let’s take a closer look at what’s inside!
Computer Vision Meets the Camera
The new `runOnFrame` worklet plugs directly into VisionCamera v5, meaning you can run segmentation, detection, or classification on live camera frames with zero extra plumbing.

```typescript
const model = useObjectDetection({ model: SSDLITE_320_MOBILENET_V3_LARGE });
const [detections, setDetections] = useState<Detection[]>([]);
const updateDetections = (result: Detection[]) => setDetections(result);

const frameOutput = useFrameOutput({
  pixelFormat: 'rgb',
  dropFramesWhileBusy: true,
  onFrame: useCallback(
    (frame: Frame) => {
      'worklet';
      try {
        const isFrontCamera = false; // using the back camera
        const result = model.runOnFrame(frame, isFrontCamera, 0.5);
        if (result) {
          scheduleOnRN(updateDetections, result);
        }
      } finally {
        frame.dispose();
      }
    },
    [model, updateDetections]
  ),
});
```

- Instance segmentation lands with support for YOLO (from nano to extra-large) and RF-DETR, giving you per-pixel object masks in real time.
- Object detection picks up the same model families: YOLO and RF-DETR.
- Semantic segmentation now supports DeepLabV3, LRASPP, and FCN.
- A dedicated Selfie Segmentation model. With this you can implement your own background blurring, virtual backgrounds, or whatever else you can think of!
- Finally – for on-device efficiency, we’ve shipped quantized variants of CLIP, Style Transfer, EfficientNetV2, and SSDLite.
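To give a sense of what per-pixel output enables, here’s a minimal sketch of tallying class counts from an `Int32Array` mask like the ARGMAX output the segmentation modules produce. The label values (0 = background, 1 = foreground) and the tiny mask are invented for illustration:

```typescript
// Count how many pixels belong to each class in a segmentation mask.
// The Int32Array stands in for a real ARGMAX output; labels are hypothetical.
function countClassPixels(mask: Int32Array): Map<number, number> {
  const counts = new Map<number, number>();
  for (let i = 0; i < mask.length; i++) {
    const label = mask[i];
    counts.set(label, (counts.get(label) ?? 0) + 1);
  }
  return counts;
}

// A toy 2x4 "mask": 5 background pixels (0), 3 foreground pixels (1).
const mask = new Int32Array([0, 0, 1, 0, 1, 1, 0, 0]);
const counts = countClassPixels(mask);
```

From the counts you can derive things like foreground coverage (here 3/8 of the frame), which is handy for deciding whether to apply an effect at all.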
Modules also expose `fromCustomModel`, so that you can easily integrate your own models!

```typescript
const MyLabels = { BACKGROUND: 0, FOREGROUND: 1 } as const;

const segmentation = await SemanticSegmentationModule.fromCustomModel(
  'https://example.com/custom_model.pte',
  {
    labelMap: MyLabels,
    preprocessorConfig: {
      normMean: [0.485, 0.456, 0.406],
      normStd: [0.229, 0.224, 0.225],
    },
  }
);

const result = await segmentation.forward(imageUri);
result.ARGMAX; // Int32Array
```

Vision Language Models on Device
`useLLM` now supports multimodal input – you can pass images alongside text messages.

```typescript
const llm = useLLM({
  modelSource: LLM.LFM2_VL_1_6B_QUANTIZED,
});

llm.sendMessage('What do you see in this image?', {
  images: [imageUri],
});
```

The first supported model is LFM2-VL 1.6B, quantized and running entirely on-device. For use cases like accessibility or document understanding, this is a meaningful step – you get multimodal reasoning at the edge with a single hook.
Kokoro TTS – Streaming and Phoneme Control
```typescript
const tts = useTextToSpeech({ model: TTS.KOKORO });

// As the LLM generates text incrementally:
await tts.streamInsert('Hello, here is');
await tts.streamInsert('the latest');
await tts.streamStop();
```

Kokoro TTS also picks up a new `forwardFromPhonemes` / `streamFromPhonemes` API, letting you bypass the built-in grapheme-to-phoneme pipeline and supply your own IPA strings – useful if you need fine-grained control over pronunciation.
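When wiring an LLM’s token stream into streaming TTS, you typically don’t want to forward single tokens – you want naturally sized chunks. Here’s a self-contained sketch of that buffering idea; the sink callback is a stand-in for a call like `tts.streamInsert`, and the punctuation heuristic is our own, not part of the library:

```typescript
// Accumulate incoming tokens and flush complete clauses to a sink.
// In a real app the sink would forward text to the TTS stream.
class ClauseBuffer {
  private pending = '';
  constructor(private flushFn: (text: string) => void) {}

  push(token: string): void {
    this.pending += token;
    // Flush once we see clause-ending punctuation, so the TTS engine
    // receives readable chunks rather than word fragments.
    const match = this.pending.match(/^[\s\S]*[.!?,;]/);
    if (match) {
      this.flushFn(match[0].trim());
      this.pending = this.pending.slice(match[0].length);
    }
  }

  end(): void {
    if (this.pending.trim()) this.flushFn(this.pending.trim());
    this.pending = '';
  }
}

const chunks: string[] = [];
const buffer = new ClauseBuffer((t) => chunks.push(t));
for (const token of ['Hello', ', ', 'here ', 'is ', 'the ', 'latest', '.']) {
  buffer.push(token);
}
buffer.end();
// chunks: ['Hello,', 'here is the latest.']
```

A real integration would also need to handle abbreviations and decimal points, but the buffering shape stays the same.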
Whisper Just Got Faster
`transcribe` and `stream` now also return `TranscriptionResult` objects with word-level timestamps, making it straightforward to build features like subtitle sync or searchable audio.

Better Developer Experience
Install `react-native-executorch-expo-resource-fetcher` or `react-native-executorch-bare-resource-fetcher`, depending on your project type, then initialize before using any hooks:

```typescript
import { initExecutorch } from 'react-native-executorch';
import { resourceFetcher } from 'react-native-executorch-bare-resource-fetcher';

initExecutorch(resourceFetcher);
```

Every module now exposes `fromModelName` and `fromCustomModel` static methods, replacing the old `new` + `load` pattern. It’s a cleaner, more predictable surface across the board.

Breaking Changes
- Initialization is now required – call `initExecutorch` with an explicit adapter before using any hook.
- Factory methods replace constructors – use `Module.fromModelName` or `Module.fromCustomModel` instead of `new` + `load`.
- ImageSegmentation → SemanticSegmentation – update imports and hook names to `useSemanticSegmentation`.
- Return types have changed – `Classification.forward` now returns a type-safe record of label names to scores. Semantic segmentation returns `Record<'ARGMAX', Int32Array> & Record<K, Float32Array>`.
- Speech-to-Text – `transcribe` now returns `TranscriptionResult` instead of raw strings, and `stream` is now an async generator.
- TTS streaming uses a new API – the callback pattern is replaced by `streamInsert()` / `streamStop()` methods.
- LLM context management – `contextWindowLength` is replaced by `contextStrategy`.
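As a small illustration of working with the new record-of-scores return type from `Classification.forward`, here’s a top-k helper. The labels and score values below are invented for the example:

```typescript
// Rank a record of label -> score pairs and keep the k best.
// The scores object mimics the shape of a classification result.
function topK(scores: Record<string, number>, k: number): [string, number][] {
  return Object.entries(scores)
    .sort(([, a], [, b]) => b - a)
    .slice(0, k);
}

const scores = { cat: 0.72, dog: 0.2, bird: 0.08 };
const top = topK(scores, 2);
// top: [['cat', 0.72], ['dog', 0.2]]
```

Because the record is typed by label name, your UI code gets autocomplete on labels instead of indexing into an anonymous array.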
For full details, check out our Release Notes.
What’s Next
We’d love for you to try it out, break things, and tell us what you think. Check out the documentation, star us on GitHub, and come say hi in our community!
