Control YouTube - Easy
Build a Maya skill to control YouTube within your browser
When we're done with this tutorial, you'll have built skills to do the following without opening your browser at all -
Search and play a video on YouTube
Pause YouTube playback
Resume YouTube playback
We'll be doing this entirely with the Browser Automation module, without using any special YouTube API or anything. So we'll be going over how to do the following with the browser automation module:
Click on an element on a page (Click node)
Open a URL in a new tab (Open node)
Query open tabs (Find Tab node)
Execute a function (a method, to be precise) on an element on a page (Execute Function node)
Let's take this one skill at a time, in order.
We want this skill to search for a video and play it on YouTube, without the user ever having to switch to the browser. Let's break this down into steps. This'll do for a start -
Take query as input from the user
Search for the query on YouTube
Click on the first video result (after this, the video starts playing automatically)
Tell the command bar the skill executed successfully
First off, we need to specify what the skill prompt will say and how the user will provide input. This is what we want the skill to look like -
Drop a bot-command node on the editor, double click on it and configure it as shown below (if this is unfamiliar to you, you might wanna go over this more basic tutorial covering the basics of skill building).
When you click "Done", the node should look like this in the editor -
The next thing we want to do is search for query on YouTube. If you notice, searching for a query on YouTube is the same thing as navigating to youtube.com/results?search_query={query}.
Let's create this URL. Drag a function node (you'll find it at the top of the palette) onto the editor and connect it to our bot-command node. Double click on this node and put this code inside -
It's going to look like this when you're done:
All we're doing here is constructing the URL we want to redirect to, and adding it as a property (searchUrl) to the msg object forwarded by this node. We can now access this URL in other nodes.
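Here's a minimal sketch of what that function node code could look like. It assumes the user's input arrives at msg.payload.query, which is a guess on our part (check where your bot-command node actually puts it), and buildSearchUrl and onMessage are just names made up for this example. Inside a real function node you'd only need the two lines that set msg.searchUrl and return msg.

```javascript
// Build the YouTube search URL for a query string.
// encodeURIComponent escapes spaces and special characters so the query is URL-safe.
function buildSearchUrl(query) {
  return "https://www.youtube.com/results?search_query=" + encodeURIComponent(query);
}

// Stand-in for the function node body: takes the incoming msg,
// adds the searchUrl property and forwards the msg to the next node.
// ASSUMPTION: the query typed by the user is available at msg.payload.query.
function onMessage(msg) {
  msg.searchUrl = buildSearchUrl(msg.payload.query);
  return msg;
}

const out = onMessage({ payload: { query: "lofi hip hop" } });
console.log(out.searchUrl);
// https://www.youtube.com/results?search_query=lofi%20hip%20hop
```

Encoding the query matters because a search like "rock & roll" would otherwise break the URL at the "&".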
This is what you should have at this point.
Now that we have the URL, let's configure our skill to open it in a new tab. Drag an Open node (from the Maya Browser Automation module) to the editor, and double click on it.
The URL that we want this node to open is in the searchUrl property of the msg object. So, select "msg." from the left dropdown on the "url" property and put "searchUrl" in the text input on the right of the dropdown.
Click "Done" to confirm these settings, and then connect this Open node to the function node. Your flow should now look like this:
This is all we need to search for the query on YouTube. Pretty straightforward. Now let's move on to the second part of the skill, which is to click the first search result. The Open node will add a property called tabs to the msg object, which is an array containing a single Tab object corresponding to the browser tab that was just opened. A Tab object contains these properties. This will be useful in a bit.
To click an element, we need to know its corresponding xpath on the page.
If you're not aware of xpaths, we recommend you read this and get familiar with them. A lot of Maya's Browser Automation functionality depends on xpaths and they're a powerful tool by themselves, so we promise it's gonna be worth your while. You can check out our resource on xpaths here.
This is what the YouTube search results page looks like, and we want Maya to click on the highlighted thumbnail.
The corresponding element selected in the devtools is the tag <a id="thumbnail">. The corresponding xpath for this element would then be:
However, this xpath corresponds to every single thumbnail on the page. We just want the first of them, so we simply modify the xpath to give us just the first element:
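As a sketch of the two xpaths described here, assuming the thumbnail link is matched purely by its id attribute: the first string selects every <a id="thumbnail"> on the page, and the helper (nthMatch, a name made up for this example) wraps it so that only the n-th match in document order is selected.

```javascript
// XPath matching every <a id="thumbnail"> element on the page.
const allThumbnails = '//a[@id="thumbnail"]';

// Hypothetical helper: wrap an xpath in parentheses and index the whole
// result set, so only the n-th match in document order is selected.
// XPath positions are 1-based.
function nthMatch(xpath, n) {
  return `(${xpath})[${n}]`;
}

const firstThumbnail = nthMatch(allThumbnails, 1);
console.log(firstThumbnail); // (//a[@id="thumbnail"])[1]
```

The parentheses matter. Without them, //a[@id="thumbnail"][1] picks the first matching link under each parent element, which can still be many elements on a results page.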
Now that we have the xpath, we can wire up a Click node to click on it. First, drag a Click node to the editor and double click on it. You should see something like this in the node's config panel.
All the properties except selector are filled out for us already, and we can leave them as is. Notice how tabId is set to msg.tabs[0].id. The msg.tabs property was set by the Open node that comes right before this node, remember? We access the id of the tab we just opened through tabs[0].id.
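If the msg.tabs[0].id notation looks odd, here's a small sketch of how a path like that can be resolved against a msg object. The getProperty helper and the sample msg (including the tab id 42) are made up for illustration; this isn't how the node itself is implemented.

```javascript
// Resolve a path such as "tabs[0].id" against an object.
// The path is split into its keys: "tabs[0].id" -> ["tabs", "0", "id"].
function getProperty(obj, path) {
  const keys = path.match(/[^.\[\]]+/g) || [];
  // Walk down one key at a time; stop with undefined if a level is missing.
  return keys.reduce((value, key) => (value == null ? undefined : value[key]), obj);
}

// Hypothetical msg as it might look after the Open node has run.
const msg = {
  searchUrl: "https://www.youtube.com/results?search_query=lofi",
  tabs: [{ id: 42 }],
};

console.log(getProperty(msg, "tabs[0].id")); // 42
console.log(getProperty(msg, "tabs[1].id")); // undefined (no second tab)
```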
Once you enter the selector in the node config, it's gonna look exactly like this:
Click "Done" to close the node config and connect the Click node to the Open node before it. The flow should now look like below. The only thing we need to do now is to notify the command bar that the skill has finished executing.
This is easy. Just drag a bot-response node (from the botutils module) to the editor and connect the Click node to it. Your skill should look like this:
That's it. Hit "Deploy" and try your skill from the command bar! Make sure you've given Maya permission to youtube.com via the extension before you try it, though.
We want this skill to allow you to pause/resume playback on YouTube without having to switch to the YouTube tab or even opening the browser. We'll do this with two skills - one to pause and one to resume, like this:
Let's build the pause skill first; the resume skill will be similar. Like before, let's break this down into steps. There are two ways we can do this -
Click the Pause button on the YouTube tab
Execute the <video> element's pause() method on the YouTube tab
We'll take the second approach (it's a lot more reliable, and this way we get to show you how the Execute Function node works hehe). Here are the steps we need to take -
Take input from the user
Find out which tab is playing YouTube
Execute the <video> element's pause() method on this tab
Tell the command bar the skill executed successfully
This is similar to what we did with the search-and-play skill above, so let's not waste time going over it again. This is what your bot-command node should look like -
The Execute Function node performs a page-automation action (i.e., does something with the website's interface). All page-automation actions require you to specify the ID of the tab on which you want to perform them. In our case, we need the ID of the tab which is playing the YouTube video.
Drag a Find Tab node to the editor and double click on it. There's just one field - "query". This is a JSON field that will contain a standard Chromium tab query. You can check out what a query can contain at this MDN page. We want a tab whose URL matches the pattern *://*.youtube.com/watch* and which is producing some sound. Here's the corresponding tab query:
Click on the three dots to the right of the query field to expand it, and then enter the above query into it.
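Going by the description above (a URL pattern plus a tab that's producing sound), the query would look something like the sketch below. It uses the url and audible keys of a standard tab query, and it's written as a JavaScript object only so we can print the JSON to paste into the field; the variable name is made up.

```javascript
// Tab query: a YouTube watch page that is currently producing sound.
const pauseTabQuery = {
  url: "*://*.youtube.com/watch*", // match pattern for YouTube watch pages
  audible: true,                   // only tabs that are playing audio
};

// Pretty-print as JSON, ready to paste into the "query" field.
const queryJson = JSON.stringify(pauseTabQuery, null, 2);
console.log(queryJson);
```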
Click "Done" to confirm these settings, and connect the Find Tab node to the bot-command node. The Find Tab node will set a property called tabs on the msg object, containing an array of Tab objects that match the query. Ideally, only one tab should match the query (what kind of a psychopath has two audible YouTube videos playing at once?), so we'll just select the first element of this array.
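To illustrate what the query is asking for, here's a small sketch that filters a hand-written list of tab-like objects and then picks the first match. The sample tabs, the findTabs helper and the regular expression standing in for the URL pattern are all assumptions made for this example; the real matching is done by the browser.

```javascript
// Made-up sample of open tabs, using the id, url and audible properties.
const openTabs = [
  { id: 1, url: "https://www.youtube.com/watch?v=abc", audible: true },
  { id: 2, url: "https://www.youtube.com/", audible: false },
  { id: 3, url: "https://example.com/watch", audible: true },
];

// Rough regex equivalent of *://*.youtube.com/watch* (http and https only).
const youtubeWatchUrl = /^https?:\/\/([^\/]+\.)?youtube\.com\/watch/;

// Keep only YouTube watch pages whose audible flag matches what we want.
function findTabs(tabs, wantAudible) {
  return tabs.filter(
    (tab) => youtubeWatchUrl.test(tab.url) && tab.audible === wantAudible
  );
}

// Mimic the msg.tabs property, then select the first match (if any).
const msg = { tabs: findTabs(openTabs, true) };
const tabId = msg.tabs.length > 0 ? msg.tabs[0].id : undefined;
console.log(tabId); // 1
```

If nothing is playing, the array is empty and there's no tabs[0] to read, so it's worth keeping that case in mind when you test the skill.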
This is what the skill should look like at this point.
Any HTML5 video player can be paused by calling its pause() method, and the YouTube player is no different. The Execute Function node is used exactly for things like this, so drag it out to the editor and double click on it.
This is what you'll see.
The highlighted element here is a <video> tag. Since there is only one <video> element on the entire page, its xpath is simply //video. That's what we'll put in the "selector" field.
In the "function" field we'll put "pause", since that's the name of the function we wanna call. Since the pause() function takes no arguments, we can just set an empty JSON array ([]) in the "arguments" field. The "tabId" field is already set to use the value msg.tabs[0].id, which is what we want. The node config should finally look like this -
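To get a feel for what the "function" and "arguments" fields amount to, here's a rough sketch of calling a method by name on an element. This isn't the node's actual implementation; callMethod and fakeVideo are made-up names, and the fake object only imitates the pause()/play() behaviour of a real <video> element so the snippet runs outside a browser.

```javascript
// Call a method on an element by name, passing the arguments as an array.
function callMethod(element, functionName, args) {
  const method = element[functionName];
  if (typeof method !== "function") {
    throw new TypeError(`Element has no method named "${functionName}"`);
  }
  // apply() keeps `this` bound to the element, like element.pause() would.
  return method.apply(element, args);
}

// Stand-in for a <video> element (assumption: just enough to run here).
const fakeVideo = {
  paused: false,
  pause() { this.paused = true; },
  play() { this.paused = false; },
};

// Equivalent of function = "pause", arguments = [].
callMethod(fakeVideo, "pause", []);
console.log(fakeVideo.paused); // true
```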
Click on "Done" to save these node properties, and then connect the Execute Function node to the Find Tab node. The skill should look like this at this point -
All that's left now is to tell the command bar about skill completion.
Just like we did for the search-and-play skill, drag a bot-response node to the editor and connect it to the Execute Function node. The final skill should look like this:
That's it! Try playing something on YouTube and then run this skill; the video should pause.
The skill to resume playback is gonna be very similar, except that we'll be executing the play() function instead of the pause() function, and in our tab query we won't be looking for an audible tab (since the tab ideally won't be playing anything). For this, just remove the "audible": true key from the tab query in the Find Tab node.
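As a quick side-by-side of the two skills, here's a sketch that lists the settings each one needs. The field names (query, selector, function, arguments, tabId) come from the node configs above, but the playbackSkillSettings helper and the findTab/executeFunction grouping are made up for this example.

```javascript
// Return the Find Tab and Execute Function settings for "pause" or "play".
function playbackSkillSettings(action) {
  const query = { url: "*://*.youtube.com/watch*" };
  // Only the pause skill looks for a tab that is producing sound.
  if (action === "pause") {
    query.audible = true;
  }
  return {
    findTab: { query },
    executeFunction: {
      selector: "//video",
      function: action,      // "pause" or "play"
      arguments: [],         // neither method takes arguments
      tabId: "msg.tabs[0].id",
    },
  };
}

const pauseSettings = playbackSkillSettings("pause");
const resumeSettings = playbackSkillSettings("play");
console.log(JSON.stringify(resumeSettings.findTab.query));
// {"url":"*://*.youtube.com/watch*"}
```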
Try building this one on your own!