Class VLMBlock
- Namespace
- VisioForge.Core.MediaBlocks.AI
- Assembly
- VisioForge.Core.AI.dll
A vision-language media block that runs a Florence-2 ONNX pipeline on video frames to caption them, detect objects, run OCR, or ground phrases. The task is selected by VisioForge.Core.Types.X.AI.VLMSettings.Task. Inherits from VisioForge.Core.MediaBlocks.MediaBlock and implements VisioForge.Core.MediaBlocks.IMediaBlockInternals.
public class VLMBlock : MediaBlock, IMediaBlockInternals, IVideoProcessingBlock, IMediaBlock, IDisposable
Inheritance
- MediaBlock
Implements
- IMediaBlockInternals
- IVideoProcessingBlock
- IMediaBlock
Inherited Members
- MediaBlock._isBuilt
- MediaBlock._pipeline
- MediaBlock._pipelineCtx
- MediaBlock.GetPipelineContext()
- MediaBlock.SetPipelineContext(BlockPipelineContext)
- MediaBlock.SetPipeline(MediaBlocksPipeline)
- MediaBlock.Context
- MediaBlock.Name
- MediaBlock.IsBuilt
- MediaBlock.Owner
- MediaBlock.Type
- MediaBlock.ID
- MediaBlock.Input
- MediaBlock.Inputs
- MediaBlock.Output
- MediaBlock.Outputs
- MediaBlock.HasInputs
- MediaBlock.HasOutputs
- MediaBlock.Build()
- MediaBlock.CreateElements()
- MediaBlock.AddElementsToPipeline()
- MediaBlock.RemoveElementsFromPipeline()
- MediaBlock.DeepCopy(string)
- MediaBlock.Reset()
- MediaBlock.ToYAMLBlock()
- MediaBlock.ClearPads()
- MediaBlock.Dispose(bool)
- MediaBlock.Dispose()
Remarks
Inference is expensive (a full autoregressive generation per frame), so it runs on a background worker and is throttled by VisioForge.Core.Types.X.AI.VLMSettings.ProcessingInterval, gated on the frame timestamp: at most one generation starts per interval, and all other frames only redraw the most recent result. Decoding is greedy (argmax); beam search is not implemented. The VisioForge.Core.MediaBlocks.AI.VLMBlock.Task and VisioForge.Core.MediaBlocks.AI.VLMBlock.TextInput properties can be changed while the pipeline runs and take effect on the next inference.
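The timestamp gating described above can be pictured with a small, self-contained sketch. The IntervalGate class and its ShouldProcess method are hypothetical names introduced only for illustration; they are not part of the VisioForge API, and the block's actual implementation may differ.

```csharp
using System;

// Hypothetical helper (not a VisioForge type): decides, from the frame
// timestamp alone, whether a frame should start a new inference.
// Not thread-safe; intended to be called from a single streaming thread.
public sealed class IntervalGate
{
    private readonly TimeSpan _interval;
    private TimeSpan? _lastAccepted;

    public IntervalGate(TimeSpan interval)
    {
        if (interval < TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(interval));
        _interval = interval;
    }

    // Returns true when the frame at `timestamp` should trigger a generation.
    // Returns false when the caller should only redraw the latest result.
    public bool ShouldProcess(TimeSpan timestamp)
    {
        // Reject frames that fall inside the current interval. A timestamp
        // that moves backwards (for example after a seek) is accepted.
        if (_lastAccepted is TimeSpan last
            && timestamp >= last
            && timestamp - last < _interval)
        {
            return false;
        }

        _lastAccepted = timestamp;
        return true;
    }
}
```

With a one-second interval, frames stamped 0 ms and 1000 ms would pass the gate, while frames at 500 ms and 1900 ms would be skipped and only redrawn with the previous result.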
Constructors
VLMBlock(VLMSettings)
Initializes a new instance of the VisioForge.Core.MediaBlocks.AI.VLMBlock class.
public VLMBlock(VLMSettings settings)
Parameters
- settings VLMSettings
-
The VLM settings. Must specify the four Florence-2 model paths and tokenizer files.
Exceptions
- ArgumentNullException
-
Thrown when settings is null.
Properties
ActiveProvider
Gets the execution provider that the model sessions actually engaged. Valid after the block has been built; reports VisioForge.Core.Types.X.AI.OnnxExecutionProvider.CPU otherwise.
public OnnxExecutionProvider ActiveProvider { get; }
Property Value
- OnnxExecutionProvider
DroppedFrameCount
Gets the number of frames the analysis had to discard. Stays 0 under the sample-latest design: the video always passes through untouched and the worker simply captions the freshest frame once the previous generation finishes, so nothing is dropped. Use VisioForge.Core.MediaBlocks.AI.VLMBlock.LastInferenceTimeMs to gauge model speed.
public long DroppedFrameCount { get; }
Property Value
- long
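One way to picture a sample-latest hand-off is a single-slot mailbox: the streaming side overwrites the slot with each new frame, and the worker takes whatever is freshest when it becomes free. The sketch below is an assumption about how such a design could look, not the block's actual code; LatestSlot, Publish, and Take are hypothetical names.

```csharp
using System.Threading;

// Hypothetical single-slot mailbox (not a VisioForge type) illustrating a
// sample-latest hand-off between a streaming thread and an analysis worker.
public sealed class LatestSlot<T> where T : class
{
    private T _latest;

    // Producer side: replace whatever is waiting with the newest item.
    // Returns the superseded item, or null if the slot was empty.
    public T Publish(T item) => Interlocked.Exchange(ref _latest, item);

    // Consumer side: atomically take the freshest item and empty the slot.
    // Returns null when nothing new has arrived since the last Take.
    public T Take() => Interlocked.Exchange(ref _latest, null);
}
```

In this sketch a superseded frame is never queued for analysis, and the video itself is unaffected, which matches the description of a counter that stays at 0.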
Input
Gets the primary input pad.
public override MediaBlockPad Input { get; }
Property Value
- MediaBlockPad
Inputs
Gets the array of all input pads.
public override MediaBlockPad[] Inputs { get; }
Property Value
- MediaBlockPad[]
LastInferenceTimeMs
Gets the wall-clock time, in milliseconds, the most recent generation took.
public float LastInferenceTimeMs { get; }
Property Value
- float
Output
Gets the primary output pad.
public override MediaBlockPad Output { get; }
Property Value
- MediaBlockPad
Outputs
Gets the array of all output pads.
public override MediaBlockPad[] Outputs { get; }
Property Value
- MediaBlockPad[]
Task
Gets or sets the task the model performs on each processed frame. Applied on the next inference.
public VLMTask Task { get; set; }
Property Value
- VLMTask
TextInput
Gets or sets the auxiliary text input used by VisioForge.Core.Types.X.AI.VLMTask.PhraseGrounding. Applied on the next inference.
public string TextInput { get; set; }
Property Value
- string
Type
Gets the type of the media block.
public override MediaBlockType Type { get; }
Property Value
- MediaBlockType
Methods
Build()
Constructs the internal GStreamer elements and pads for this block.
public override bool Build()
Returns
- bool
-
true if successful, false otherwise.
CleanUp()
Cleans up internal resources, specifically the GStreamer element.
public void CleanUp()
Dispose(bool)
Releases unmanaged and - optionally - managed resources.
protected override void Dispose(bool disposing)
Parameters
- disposing bool
-
true to release both managed and unmanaged resources; false to release only unmanaged resources.
GetCore()
Gets the core GStreamer element wrapped by this block.
public BaseElement GetCore()
Returns
- BaseElement
-
The VisioForge.Core.GStreamer.Base.BaseElement wrapper, or null if not built yet.
GetElement()
Gets the GStreamer element instance.
public Element GetElement()
Returns
- Element
-
The Gst.Element, or null if not built yet.
IMediaBlockInternals.SetContext(MediaBlocksPipeline)
Sets the pipeline context for this block.
void IMediaBlockInternals.SetContext(MediaBlocksPipeline pipeline)
Parameters
- pipeline MediaBlocksPipeline
-
The pipeline.
Events
OnResultGenerated
Event raised when the model finishes an inference on a frame.
public event EventHandler<VLMResultGeneratedEventArgs> OnResultGenerated
Event Type
- EventHandler<VLMResultGeneratedEventArgs>
See Also
- MediaBlock
- IMediaBlockInternals