Not completely open source

Michael_Speth · November 28, 2015, 7:17pm

I have read that parts of the backend are using proprietary libs which means the backend is most likely going to be proprietary.

How can you possibly say the Mycroft project is open source if the backend is proprietary? Is that claim a dishonest way to get funding on Kickstarter?

ryanleesipes · December 2, 2015, 5:26am

Where did you read that? None of the backend will be proprietary. We may have to use services like Wit.ai or others for STT as we train an open source model. But we are aiming to ensure the entire stack is open source.

ryanleesipes · December 2, 2015, 6:28am

@Michael_Speth I realized we have not been clear on how everything is going, so I went ahead and posted the roadmap and answered these questions in this post: https://community.openconversational.ai/t/the-mycroft-roadmap-and-open-source

Michael_Speth · December 2, 2015, 6:54am

As an outsider, I hear open source, then read another post stating that you’re hooking into 3rd party closed source libs. This raises the question: can you legally open source (GPLv3) the backend, given the licensing of the 3rd party libs?

I am also very sceptical because the project says ‘open source’ but I haven’t seen any development or code.

You say you’re going open source, but why isn’t your development process transparent? I really hope you’re using a project management tool. If you’re not, I highly suggest using Jira. It’s free for open source projects. You can sign up here:
https://developer.atlassian.com/opensource/

Open source projects should be tracked in the public space. This allows the community to be fully involved, from developers who contribute code to users submitting bugs and feature requests.

Confluence is also free for open source - it’s a great wiki tool.

Git repos: GitHub and Bitbucket are the most popular ones - again free for open source projects (public).

As for continuous integration tools: Atlassian has Bamboo, which hooks into AWS nicely. Travis CI works well with GitHub.

seanfitz · December 2, 2015, 7:50am

Hey @Michael_Speth, thanks for the suggestions!

We’re currently getting our ducks in a row on open sourcing various components of the stack. It’s always a struggle to balance documentation with active development, and we want to be sure the quality bar is high for our first release. I’ll be working this sprint on prepping Adapt for release, and look forward to any feedback on the framework.

As for tools, we’ve made a good amount of progress there, which I’m happy to share.

For source control, we’re using GitHub. We’ll be opening up individual repos as we think they’re ready for public consumption. We know that if it’s perfect, we waited way too long, so we’re trying to balance that urge with the desire to release our cool shiz to the community.

For project management, after experimenting with a couple of different platforms, we settled on Jira about a month ago, and have managed our last two sprints (as well as arbitrary internal tasks) with it. I’ve used it professionally a number of times in the past, and there’s really nothing better. “Industry standard” is the term that comes to mind. It has a number of integration points with GitHub (though, admittedly, Bitbucket integrations are better).

For CI, we’re using Jenkins. It’s not the flashiest of tools, but it’s lightweight enough that we can run a RasPI as a slave for on-device integration tests and metrics collection. That’s been a massive boon.

For each of these tools, we’re evaluating how we open up the relevant portions to the public; Mycroft is a company, and we’re using these tools for the company as well as the project. The general public should absolutely see the pull requests on our init scripts, but not necessarily the Jira issues related to getting VPN tokens shipped to devs.

As for the use of 3rd party closed source libraries, I think there may be a terminology kerfuffle brewing. The proprietary services that have been discussed for development purposes are publicly (and mostly freely) available web services from various providers. As we do not have access to the source or the binaries for those services, we cannot distribute them. We can, however, distribute client libraries (mostly thin wrappers over an HTTP client like urllib) and will be doing so. We are currently using a library for remote speech recognition named, aptly, SpeechRecognition. When we have our own speech backend, we intend to extend this library, allowing all consumers of the library to consume Mycroft’s speech services easily, and to also allow Mycroft users/devices to easily switch between speech recognition providers. When faced with hard questions like this, my instinct is to go to the spec (or in this case, the GPLv3 docs on gnu.org), and the spec seems pretty clear on the subject. Please note that link is for the AGPLv3 license, a stricter version of GPLv3 that attempts to close the loopholes around client/server boundaries and responsibilities therein.
#iamnotalawyerbutiwatchfranklinandbash
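To make the "thin wrapper over an HTTP client like urllib" idea concrete, here is a minimal sketch of what such a client library could look like. It is not Mycroft's code and not the SpeechRecognition library's API: the class name, endpoint URLs, header names and JSON response shape are all assumptions made up for illustration. The point is only that the distributable part is a small client, and that switching providers can come down to swapping an endpoint.

```python
import json
import urllib.request

# Hypothetical endpoints, for illustration only.
PROVIDER_ENDPOINTS = {
    "hosted": "https://stt.example.com/v1/recognize",
    "self_hosted": "http://localhost:8080/recognize",
}


class HttpSttClient:
    """Thin wrapper over urllib for a hypothetical remote STT web service."""

    def __init__(self, endpoint, api_key, opener=urllib.request.urlopen):
        self.endpoint = endpoint
        self.api_key = api_key
        # The opener is injectable so the client can be exercised offline.
        self._opener = opener

    def build_request(self, wav_bytes):
        # Header names are assumptions, not any real provider's contract.
        return urllib.request.Request(
            self.endpoint,
            data=wav_bytes,
            headers={
                "Content-Type": "audio/wav",
                "Authorization": "Bearer " + self.api_key,
            },
            method="POST",
        )

    def transcribe(self, wav_bytes):
        # Assumes the service answers with JSON of the form {"text": "..."}.
        with self._opener(self.build_request(wav_bytes)) as response:
            payload = json.loads(response.read().decode("utf-8"))
        return payload["text"]


def make_client(provider_name, api_key, opener=urllib.request.urlopen):
    """Pick a provider by name; callers never touch the endpoint directly."""
    return HttpSttClient(PROVIDER_ENDPOINTS[provider_name], api_key, opener)
```

With something like this, only the wrapper ships under an open license; the service behind the endpoint stays wherever its provider runs it, and `make_client("self_hosted", ...)` would be the switch to a backend you host yourself.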

Speech recognition is a hard problem, and we’re looking for any help we can get from the community to drive this initiative forward. Please subscribe to OpenSTT, and point any qualified friends in the right direction! Specifically, people with experience with Kaldi, Sphinx, or other ASR toolkits are welcome, as well as anyone with experience feeding temporal signal data (like audio or video) into various deep learning toolkits.

There is a moon, and we are shooting for it.
#killallthemoons

ryanleesipes · December 2, 2015, 8:17am

Thank you for the clarification on these @seanfitz - clear and concise as always. Sending many good feels your way.

Autonomouse · December 2, 2015, 11:52am

At serious risk of being argumentative, aren’t Jira, Confluence, Bitbucket, Bamboo and AWS closed source, 3rd party services? I’m curious why you would recommend those when there are FOSS alternatives available? (Launchpad does all of that, for example, but there are many more)

You could argue (incorrectly, IMHO) that they work better than the FOSS alternatives, perhaps? If that is the case, maybe you would go on to mention that Mycroft is a fledgling project, very much the David in the fight against the Goliaths of Google, Apple, Microsoft, Nuance, etc., and as such should not compete head-to-head in round one but instead choose its battles more cautiously. Would it be foolish for the Mycroft devs to try to do everything all at once and deploy their own stuff, when they could use freely (or cheaply) available 3rd party services while they concentrate on getting as much FOSS stuff out of the door as they can manage?

I’m putting words in your mouth, and I apologise for that, but I felt that it illustrated the point quite nicely

This is version one. And it’s going to be a commercial product, so it’s got to work - and be competitive - if they expect people to buy and use it.

As long as they release code, such as the OpenSTT project and Adapt, and use FOSS wherever viable, then it’s a win for open source, in my opinion. This way, we can chip away at the need for proprietary services, and at some point in the future (probably when small cheap devices like the RPi are powerful enough to host those services locally) a truly FOSS Mycroft (or future Mycroft-like) product will emerge.

Just to reiterate the point to death, if they try to do everything all at once, they will fail and we’ll be back to having nothing. Using 3rd party, proprietary services isn’t ideal, but at least this way they can work on one part at a time while still having a product on sale at the end of it!

Michael_Speth · December 2, 2015, 12:35pm

Jira is not closed source. If you purchase a license you get access to the source code. It’s still proprietary though. I am very curious: what is a FOSS alternative to AWS? When I say AWS, I don’t mean alternatives to Xen + AWS services, like OpenStack. I mean a cloud provider where you can provision VMs and where it’s free and open source. Please do enlighten the community about this FOSS cloud service.

Well, the devs here picked Jira, so there’s not much point in arguing over that with me. It’s your fight with them now.

I am also curious about a FOSS remote git repo service. I am not talking about FOSS git repo software but actual web services.

What I am getting at is that remote services are not open. You will never know the code that is running on those services, even if they claim it’s FOSS, due to security.

Since you are typing here in this forum, the end-to-end technology you are relying on uses large amounts of proprietary software, hardware, and systems.

You see, the difference between my argument and your strawman is that my argument is about the claim of this project being open source or not and transparent or not. Your strawman makes the claim that since I questioned this project’s open source status, I should not be suggesting that they use 3rd party web services for development because they are not FOSS.

I hope I have demonstrated that your strawman is ridiculous, and that you become a better person on the internet and stop using strawmen.

Autonomouse · December 2, 2015, 2:02pm

Oh really? That’s good, I didn’t know that.

I was indeed referring to OpenStack. I use the following Canonical technologies as examples simply because I am familiar with them (although I’m sure you can find others): the use of Juju + MAAS allows very straightforward OpenStack deployments on your own hardware. But this is all beside the point; using AWS sounds like a sensible choice too.

No argument, Jira’s a good choice. I was using it to illustrate the point that sometimes in an imperfect world, a proprietary service can often be a good choice if there isn’t a suitable alternative. Especially, as was my main point, where it’s acting as a crutch while a FOSS alternative is being developed. The example of Linux itself, with BitKeeper and then Git, leaps to mind too. Here it looks like the Mycroft devs will be using external 3rd party services, while opening up the parts that they develop themselves. Once those parts are in place, it’s not infeasible that a switch-over can be made, especially if designed to accommodate this from the start. In fact, I seem to remember it being said somewhere (during the Kickstarter?) that if you want to run your own server instead of using these, you can. Making the local code independent would help make these 3rd party services a ‘stepping stone’ towards this goal.

I already mentioned Launchpad.

Launchpad is now open source: Launchpad Blog
Launchpad git hosting: Launchpad Blog

I think there’s GitLab too, although that’s now only “open-core”.

It was indeed a strawman and I have already said as much in my original statement:

but you don’t seem to grok my point. I have no problem with using proprietary stuff when necessary, especially if it is a means to an end where that end is replacing as much of it as possible with FOSS. I was using said strawman as an example of how the use of a 3rd party web service is not incompatible with writing a free piece of software. Your original point, if I’m not very much mistaken, was about Mycroft not being 100% open-source. Now, as far as I am aware, the parts of Mycroft that are actually being developed will be open, with the potential to add their own/your own service later on.

I’m not here for an argument (easily said after vigorously countering each and every point made), so let’s instead channel our energies into discussions about what we can do to minimise and eventually replace the proprietary stuff, or make suggestions to improve Mycroft to such a level that it makes so much money it can run its own service (along with an open API to mine all of the lovely data it’s collected too, hopefully). A discussion that will no doubt be easier when they, y’know, release the code and all (cough, cough @ryanleesipes @seanfitz )

Autonomouse · December 2, 2015, 2:23pm

Oh, maybe I am… never mind then

ryanleesipes · December 2, 2015, 6:46pm

I agree @Autonomouse. We are working tirelessly trying to get everything in our stack open sourced and to create the pieces that don’t already exist. But if we want something that works we have to use what we can (open APIs) while we develop the FOSS alternatives. So the short of it @Michael_Speth is we are working on ensuring the entire stack is FOSS, but it takes time.

Autonomouse · December 2, 2015, 8:12pm

As a thought experiment/fun community project suggestion, I wonder what would be involved in producing your own backend? I appreciate that it would be slow and inaccurate - let’s assume we’re talking about running this on a PC, not a Raspberry Pi - but would it just be a case of writing a wrapper around pocketsphinx?

Michael_Speth · December 2, 2015, 9:18pm

I was using it to illustrate the point that sometimes in an imperfect world, a proprietary service can often be a good choice if there isn’t a suitable alternative.

All 3rd party services are proprietary. You totally missed the point and my sarcasm. I know that there is FOSS for cloud technology, like OpenStack, and for various remote git repo services. My question is meant to point out that THERE ARE NO FOSS services. All of these services are locked behind firewalls and certainly permissions. Meaning, you can never get at the actual code fuelling these services. And that is OK.

I still think that their Jira instance SHOULD be open and transparent. If they need certain issues to be private (like administration stuff, ssh keys, etc.), then they can easily create a separate Jira project that restricts the rights. Why have they pushed the idea that this project is open source so hard? Is it to attract attention? If not, then they need to breed a community of contributors to push the project along. This is why their Jira instance should for the most part be open for all. I personally would have RO permission on everything except maybe an admin project. Contributors would get RW access but certainly not Admin. This way, developers can become a part of the community and actually contribute.

seanfitz · December 3, 2015, 12:54am

Hey @Autonomouse, as I outlined in this post, getting something that “recognizes speech” isn’t really that difficult. The main speech recognition loop uses pocketsphinx to determine whether or not an utterance contains a Mycroft “wake phrase”, and only then sends the audio off device for full ASR. As for accuracy, that’s a different story. To be extensible (in terms of number of skills), we really need a dictation recognizer. There are a few claims that pocketsphinx can do this, as well as Sphinx-4, but I haven’t personally had any success with them. We realistically need >= 85% (magic number, synonym for high) accuracy to make a usable system.
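The gating described above (local wake phrase check first, remote ASR only afterwards) can be sketched roughly as follows. This is an illustration under assumptions, not the actual Mycroft loop: the function name is made up, and both the wake phrase detector and the remote recognizer are passed in as plain callables so that no pocketsphinx or web service API is guessed at here.

```python
def transcribe_after_wake_phrase(utterances, contains_wake_phrase, remote_transcribe):
    """Yield transcripts only for utterances that pass the local wake phrase check.

    utterances: iterable of audio chunks (for example, bytes per utterance).
    contains_wake_phrase: local, on-device check (a stand-in for a
        pocketsphinx-based detector).
    remote_transcribe: callable that sends audio off device for full ASR.
    """
    for audio in utterances:
        if not contains_wake_phrase(audio):
            # Nothing is sent off device unless the wake phrase was heard.
            continue
        yield remote_transcribe(audio)
```

For example, with stand-in callables such as `lambda audio: audio.startswith(b"hey mycroft")` for the local check, only matching utterances ever reach `remote_transcribe`. A purely local backend would amount to passing a local dictation recognizer as `remote_transcribe`, which is where the accuracy problem above comes in.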

We’ve opened up on some of the tools we’re using (specifically the SpeechRecognition client library). If you wanted to poke around with some of the available tools, we’ll gladly consume your successes.

I would start here:
http://www.speech.cs.cmu.edu/sphinx/dictator/

jeffbass · November 9, 2017, 4:46pm

I am new to these forums and am interested in increasing my understanding of the continuing efforts to take Mycroft open source. Mycroft is an ecosystem of many software parts. It is also an ecosystem of design philosophy. I like the Mycroft design (what I can see of it so far). I also like the fact that these forums exist so users and the Mycroft core team can share discussions that are not directly related to code issues and pull requests.

I have a question: what software is being used to host this forum? I realize others may recognize it, but I don’t. Is this forum software open source?

Wolfgange · November 9, 2017, 4:59pm

Yep! It’s using Discourse which is open source. I had the same question when I first came to the forums.

jeffbass · November 9, 2017, 10:32pm

Thank you. I had not heard of Discourse before, but I’ll take a look at it. It seems to be well suited for forums like this one.

Adam_Monsen · October 2, 2019, 3:06am

Hiya, any updates on open sourcing the whole stack? This thread is almost 4 years old…

Dominik · October 2, 2019, 5:14am

Yes, today the open source release of the Selene backend was announced: https://chat.mycroft.ai/community/pl/1jxga71rufgrix1hq1xtpx98ma

But this might be a bit too heavy, as it is designed to serve thousands of Mycroft devices. You may also want to have a look at this: https://github.com/MycroftAI/personal-backend