Blender - the first 3D application with voice control?

Would it be possible to have voice control (voice recognition) for Blender? I came up with this idea while wearing my headset, noticing that I sometimes whisper the shortcuts I’m using.
Would it make sense to have voice control over Blender commands? Saying “Add Mesh Cube” to add a cube, saying “Merge vertices at last” to merge the selected vertices, and so on.

Is something like this technically possible at all? Which language would Blender “speak”? How would something like this be adapted into the current Blender workflow? Obviously you cannot select vertices or do sculpting strokes by voice control, but most other commands might work.

I’d like to read your opinions on this. I think this functionality would make Blender pretty unique and fast to work with.

4 Likes

Technically this is of course possible, at least hypothetically. Practically, though… pressing M, 2 (or, if you care about efficiency enough to reassign the merge shortcut to something more reachable, something like Alt+X, 2) is something you can do 3-5 times in the amount of time you would need to enunciate “Merge vertices at last” clearly enough for any voice recognition to understand it, and that is assuming the recognizer itself is fast enough to process the command and its context, which is a big assumption. The same goes for “Add Mesh Cube”, which is Shift+A, 1, 2.
Voice control will never be as fast as mouse/keyboard input, anywhere.

1 Like

Stan, thanks for your answer. The commands I gave were of course just examples standing in for any Blender command.
Regarding the reaction speed of a voice control add-on, I agree with you that this is a critical point. It would maybe need some context-sensitive matching algorithm, similar to the new search: if you give a vague command, the voice control could suggest similar possible commands to execute.

I also thought about this. It would be cool for some commands like “left view”, “top view”, etc., but not for modeling operators; there it really would be too slow.

I think it would be possible to write a Python add-on that uses Windows’ speech recognition features, but I haven’t had time to look into it. I also don’t know what could be used on Linux.
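
Just to sketch the idea (not a finished add-on: this assumes the third-party SpeechRecognition package is installed into Blender’s Python and a microphone is available, the phrase table is made up, and it uses that package’s online recognizer rather than Windows’ own API, purely for brevity):

import bpy
import speech_recognition as sr  # third-party "SpeechRecognition" package

# hypothetical table of spoken phrases -> Blender operators
COMMANDS = {
    "add cube": lambda: bpy.ops.mesh.primitive_cube_add(),
    # view operators need a 3D Viewport context, so a real add-on would
    # have to override the context here
    "top view": lambda: bpy.ops.view3d.view_axis(type='TOP'),
}

recognizer = sr.Recognizer()

def listen_once():
    # record one utterance, recognize it, and run the matching operator
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        phrase = recognizer.recognize_google(audio).lower()
    except sr.UnknownValueError:
        return
    action = COMMANDS.get(phrase)
    if action is not None:
        action()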

1 Like

Good point. It could make more sense for the “peripheral” commands: “Render”, “Compositor”, or file commands such as saving; commands that are not used often.

1 Like

Here’s an example of this in 3ds Max from 10 years ago. I’ve seen other, more recent Max users working this way.

From 3 years ago in Max

Here’s one I just found for Blender. His source code is on his Gumroad. Might be worth checking out (it’s 3 years old).

3 Likes

Hi @KarlAndreasGross.

To send commands (hotkeys) to Blender with your voice, you can use Voice Macro (freeware for Windows): https://www.voicemacro.net/

And if you don’t have a microphone (or a headset with a microphone), you can use your smartphone as a microphone. To make the “bridge” between your smartphone and your computer, you can use https://wolicheng.com/womic/ (free).

3 Likes

Thank you. I will dig into this. Switching workspaces and calling sculpt tools is a MUST for voice command.

1 Like

Thanks xan,
VoiceMacro seems to be an interesting starting point. I like the idea of assigning certain Blender commands to certain control words. This is like a keymap, only with words.
A Blender add-on for voice control should take that into account: the user should have the option to bind commands to control words of their own liking. For example, the control word for switching to the node editor could be “Noodles” :wink:
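
A very rough sketch of what such a word keymap could look like in Python (the control words, the operator choices, and the helper are all made-up examples, not taken from any existing add-on):

import bpy

# hypothetical: control word -> (operator idname, keyword arguments)
VOICE_KEYMAP = {
    "noodles": ("screen.space_type_set_or_cycle", {"space_type": "NODE_EDITOR"}),
    "render": ("render.render", {}),
    "save file": ("wm.save_mainfile", {}),
}

def run_voice_command(word):
    entry = VOICE_KEYMAP.get(word.lower())
    if entry is None:
        return
    idname, kwargs = entry
    module, _, op_name = idname.partition(".")
    # look the operator up by its idname and invoke it
    getattr(getattr(bpy.ops, module), op_name)('INVOKE_DEFAULT', **kwargs)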

I hope that more people get interested in this topic.

Aren’t there any AI-based solutions on GitHub?

It is hypothetically possible, but it would need a codebase almost as big as Blender itself. I would prefer development effort to go into Blender rather than into voice recognition, and I don’t have, or know anyone who has, problems that would require voice recognition.
All in all, I wouldn’t want a voice recognition feature in Blender unless it were small and fast.
VoiceMacro seems like a good start, but I will still prefer a keyboard and mouse.

1 Like

I agree. I would also prefer the development of Blender to be focused on performance improvements, bug fixes, new features (related to 3D), etc.

I just don’t think that voice recognition should be implemented directly in Blender. Maybe it could be done as an add-on? (But even then, I think it would be a huge task.) IMO, third-party tools like Voice Macro (or other software) already do the job pretty well.

4 Likes

OK, thanks to all who added information to this thread. I also agree that there are more important development tasks to do in Blender, and VoiceMacro is a good helper for the current state of the art. Let’s see where this stands ten years from now.

An idea: add an adaptation of Voice Macro to cut down development time, make it an option like other UI features, and limit it to the kinds of commands xsan2622 suggested.

Why? To make Blender available to those who may need it because of coordination problems or another impairment, further widening the user base.

1 Like

Nobody is demanding that voice recognition be developed from scratch for Blender; that would be silly. :slight_smile: An add-on would use an existing voice recognition library. I am not sure what the advantages over VoiceMacro would be, though.

You’re totally on point! :smiley:

1 Like

Well, I have already been working in Blender with voice since 2016… I am a pen display user with RSI, so my hand does not like holding a mouse or keyboard for long periods of time…
https://youtu.be/LcilO5CsfHM?t=20

The approach in the above video requires only the free Vocola 3.1 + Windows Speech Recognition (already included in Windows 7 and 10) + AutoHotkey to solve the Shift key problem…

It is so easy to search in Blender 2.9, since the Command Search has really improved and almost every operation/command is available in it. So I created my command search in Vocola code like this…

export (Collada=col|Alembic=ale|3D Studio=3ds|3DS|FBX|Wavefront=wav|Object=obj|OBJ) = BlendVCL.CommandSearch("export $1",0);

And I can now say “export Collada” or “export 3D Studio” or “export OBJ” or whatever type is listed in the code. My BlendVCL plugin, written specially for Vocola, then does the search for me, and it takes less than one second to search and pick the right command to export my model.
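
To give a rough idea of what “search and pick the right command” can mean, here is a naive Python illustration (this is not my actual BlendVCL code, and the helper name is made up; it simply substring-matches operator labels):

import bpy

def find_operator_by_label(query):
    # naive substring match over operator labels; the real command search
    # is much smarter, this only shows the idea
    query = query.lower()
    for module_name in dir(bpy.ops):
        module = getattr(bpy.ops, module_name)
        for op_name in dir(module):
            op = getattr(module, op_name)
            if query in op.get_rna_type().name.lower():
                return op
    return None

op = find_operator_by_label("export collada")
if op is not None and op.poll():
    op('INVOKE_DEFAULT')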

Here is another sample of Vocola code…

select N Gon = doCommandPort("bpy.ops.mesh.select_face_by_sides(number=4, type='GREATER', extend=True)","","");

It will select n-gon faces in my mesh when I say “select N Gon”. This command requires the Blender Command Port add-on to be active, because my Vocola extension sends the Python code bpy.ops.mesh.select_face_by_sides(number=4, type='GREATER', extend=True) directly to Blender. You can find this add-on on GitHub.
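
For anyone curious what “sending Python code directly to Blender” looks like from the outside, here is a rough illustration only; I am not quoting the Command Port add-on’s real protocol, and the port number is a placeholder (check the add-on’s documentation for its actual settings):

import socket

CODE = "bpy.ops.mesh.select_face_by_sides(number=4, type='GREATER', extend=True)"

# assumes a command-port server is listening locally and accepts raw Python
# source as UTF-8 text; 5000 is a made-up port number
with socket.create_connection(("127.0.0.1", 5000)) as sock:
    sock.sendall(CODE.encode("utf-8"))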

Here is another sample video showing how I control the Blender UI internally by sending Python code to it. I can do anything from splitting the view, changing the size of the gizmos, and showing per-object mesh wireframes, to showing only specific types of objects in the view…
https://youtu.be/czlMDC_OEhQ?t=175

All the code and the extension are available on my Gumroad. The free set will let you use the Command Search and simple keystrokes but no Python code. The extended version has ALL the functionality, but I think you should try the free set first!
https://gumroad.com/l/BlenderVCL

3 Likes

Good job on the project! I am a speech recognition researcher. If there are any pain points in using the speech recognizer, I may be able to help on the internals. Currently I am with the Microsoft speech recognition team, but this message is personal and is not related to my affiliation or any other organization.

3 Likes

Wow, great news! It’s always good to see people from the non-Blender world joining in!

1 Like

Thanks for your reply. From my experience so far with WSR and Vocola 3.1, there are many times when the recognition engine tries to dictate my commands as text instead of sending the macro/keystroke or whatever action I defined. I found that another program called “VoiceMacro” can adjust the way its recognition engine works to the point that all of my spoken words turn into commands (not text dictation), which is what I need, but the text-based command style of Vocola is a more intuitive way to add/edit a lot of commands for a 3D program like Blender. I don’t know if this is possible, but I would like to “force WSR to command mode only” or turn off dictation completely…

Another thing, not really related to WSR, but which I should mention since you said “any pain points”: Vocola 3.1 is no longer in development, it only supports the .NET Framework up to version 3.5, and I have a difficult time writing its extensions to add new features…

1 Like

My understanding of your first point is that the recognition engine has trouble when you say two (or more) commands with short pauses: instead of recognizing the commands as “command1”, “command2” separately, it outputs “command1 command2” as text. This is related to the voice activity detection (VAD) module, which segments sentences based on the duration of the silence. There should be a parameter for tuning the sensitivity of the VAD module, which I think is what VoiceMacro supports.
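
To make that concrete, here is a toy sketch of silence-based segmentation (production VAD modules are far more sophisticated, and every number here is arbitrary). Lowering energy_thresh or max_silence makes the segmenter cut sooner, which is roughly what a sensitivity setting exposes:

import numpy as np

def segment_by_silence(samples, rate, frame_ms=30, energy_thresh=1e-4, max_silence=0.5):
    # split a mono float waveform into (start, end) sample ranges, closing a
    # segment once the signal stays quiet for longer than max_silence seconds
    frame_len = int(rate * frame_ms / 1000)
    max_silent_frames = int(max_silence * 1000 / frame_ms)
    segments, start, silent = [], None, 0
    for i in range(0, len(samples) - frame_len, frame_len):
        frame = samples[i:i + frame_len].astype(np.float64)
        active = np.mean(frame ** 2) > energy_thresh
        if active:
            if start is None:
                start = i
            silent = 0
        elif start is not None:
            silent += 1
            if silent > max_silent_frames:
                segments.append((start, i))
                start, silent = None, 0
    if start is not None:
        segments.append((start, len(samples)))
    return segments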

As for the second point, I think it may relate to application development frameworks, which I am not very familiar with (I learned iOS application development before but haven’t touched it in years). In general, I think it would be great to use the same framework as Blender or some cross-platform framework.

From my perspective, the most interesting point of this direction is whether voice commands can make it easier for new users to learn the shortcuts or to use the application with more natural commands.

1 Like