Lessons Learned from Establishing an Internal Knowledge Base

If you’re unfamiliar with the concept, a knowledge base is a collection of easily searchable, well-organized documents to which anyone in the company can contribute. Its purpose is to document institutional knowledge in a centralized place that everyone can access.

At companies where I previously worked, the internal wiki was immensely useful to me. It was the first place I went when I ran into an issue, technical or non-technical. If an answer didn’t exist in the wiki, I’d write a page myself. I would also create pages for myself and add my daily notes in the hope that they might help me or someone else someday. And they often did.

At Agolo, I advocated for starting a knowledge base for our company. After a lot of investigation and deliberation, we chose DokuWiki. I’m learning a lot from the process.

This blog post is an attempt to organize and distill the options we looked at before settling on DokuWiki. Hopefully, it helps to guide you through a similar process at your organization. In the next post, I will write about the challenges and lessons learned from using our DokuWiki.

The following are all the options we considered when picking knowledge base software for our startup.

1. Wiki

I heavily advocated for this tool because of my previous experience with it.

We considered a number of alternative wiki offerings. First, we looked for a hosted solution that was preferably free. We looked at Wikia, but it didn’t offer private, internal wikis.

We looked at Gollum, and possibly creating an empty GitHub repository just for its wiki, but decided against it. It doesn’t have a large enough community or a library of plugins to choose from. Also, it doesn’t have the ability to tag pages as far as I could tell. But it was a very strong contender.

We also considered Confluence, but decided it was too complex and too expensive. We don’t use JIRA or any other Atlassian products, so we felt we wouldn’t be making full use of it for the amount of money it costs. If not for the cost, Confluence has everything we were looking for.

How.dy’s Slack Wiki looks ideal from what we could tell. It uses Markdown, it has a simple and modern UI, and it’s free. It has Slack integration and uses Slack for authentication. It also uses flat text files as its storage engine. It might not have search, so that was one drawback. The main deterrent to using it, however, was that it is not available to the public. The blog post says that it will be opened up, but I could not find a follow-up post that announces its availability.

And last but not least, we investigated MediaWiki, the software Wikipedia is built on. It is a very full-featured wiki, tried and true through its use at Wikipedia, and most people are already familiar with its UI. We didn’t choose it because it is more powerful than we need. We wanted our wiki to be lightweight, easy to install, and not too overwhelming. In addition, we wanted to avoid using a full-fledged database as the storage engine. So, MediaWiki was out.

2. Gitbook

One of my coworkers recently used Gitbook to write a book, and he was very enthusiastic about it. So, we considered having our knowledge base in Gitbook format as well. The idea was each page of the knowledge base would be its own chapter, or sub-chapter, or sub-sub-chapter, depending on where it fits in a global hierarchy of documents.

The advantages would be that it uses GitHub-flavored Markdown, which all of us are already pretty familiar with. It has a modern look and feel, which most wiki offerings do not. It also forces us to think in a hierarchical structure when creating pages.

However, I felt that it had too many drawbacks. I’d say that the forced hierarchical nature is a drawback in itself. Knowledge bases should have as little friction as possible for page creation. If creating a page meant thinking about where it fits in the global structure of the knowledge base, page creation would become less frequent.

In addition, moving from read mode to write/edit mode feels very sluggish — this transition has clearly not been optimized. When writing a book, an author doesn’t often switch between editing and reading. So, this makes sense for that use case. A knowledge base should be much more permissive of switching from reading to editing. This would encourage more participation.

Another disadvantage is that, in my mind, a knowledge base should be full of links to other pages in the knowledge base. In Gitbook, linking to another chapter is not a frequent use case, because the paradigm it’s based on is books, which are read linearly from one chapter to the next.

Another big reason I advocated against Gitbook is that it does not scale. Having one chapter per document might work in the first few months. However, as our company grows, so will our knowledge base. Having 500 chapters would become cumbersome if every new chapter had to fit into an existing hierarchy, and a chapter list that long would be useless for navigation.

And finally, having a private Gitbook costs $7 per month.

So, we decided not to use Gitbook for our company knowledge base.

3. Evernote

Some of us are already heavy users of Evernote. So, we considered just creating a notebook where all of us would keep adding notes.

We already use Evernote as a collaborative tool for some specific purposes. For example, we store meeting notes in Evernote notebooks. This is made especially easy because some of us use the Scannable app to take a photo of a hand-written page of notes and make it searchable in Evernote. This is a huge advantage.

Another advantage of using Evernote is the extremely low friction of creating and editing notes. There is no notion of edit mode vs. read-only mode, so users will be encouraged to edit whichever page they’re reading. This is a highly desirable effect in a knowledge base. Also, because it is an application, it is extremely mobile-friendly in addition to being highly performant on our laptops.

However, I felt there is an inherent lack of structure when it comes to dumping everything into Evernote. This is on the opposite end of the spectrum from Gitbook. To me, it would cause more problems by making it too easy to create new pages — it would become more difficult to find old pages.

In addition, having every document in the knowledge base live under the same notebook seems problematic. If it were possible to share a hierarchy of notebooks, it would be easier to organize the knowledge base into smaller categories, and Evernote would be more viable in my eyes. Unfortunately, Stacks are not shareable.

So, while Evernote is a strong contender, we decided against it.

4. Pivotal Bookbinder

We weren’t too familiar with the principles behind Bookbinder, and the documentation doesn’t seem to be detailed enough for us to get acquainted. In addition, the setup seemed like a barrier to entry to us.

We also could not tell whether this software supports tagging and categorizing pages.

5. SharePoint

In addition to being pricey, SharePoint is too heavyweight for our needs. We aren’t extensive users of the MS ecosystem, and it seems complex, with a steep learning curve.

While SharePoint also has an option to create a wiki, it does not support Markdown. So, we did not want to commit to using SharePoint.

My Experience at TechCrunch Disrupt Hackathon 2016

Last weekend, I participated in the TechCrunch Disrupt Hackathon in New York City. Here’s my demo.


The story of how I got on that stage with that project is slightly more complicated.

I originally went to the hackathon as part of a team: me, Tom, Lowell, Shabnum, and Scott.

The hackathon took place at the Brooklyn Cruise Terminal, a very industrial-looking place.

We were one of the first teams to arrive, so we got to pick a good table.

Our project was in EdTech, and we called it Mindset. Scott has written about it here. My job was to set up and implement the Natural Language Processing backend server and its API endpoint. The application would send the server a syllabus, and the server would parse it into topics, tag each topic, extract dates and deadlines, and return a nice data structure with all of this information. Its topic extraction would be powered by IBM Watson’s Concept Insights API.
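To make the shape of that response a little more concrete, here is a minimal sketch of what a syllabus parser could look like. It is not the code we wrote at the hackathon: the `Topic` class, the `parse_syllabus` function, and the date pattern are all hypothetical names chosen for illustration, and the tagging step (which relied on IBM Watson) is left out entirely.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    """Hypothetical response item; the real Mindset format is not shown in this post."""
    title: str
    tags: List[str] = field(default_factory=list)   # left empty here; tagging used Watson
    dates: List[str] = field(default_factory=list)


# Naive pattern for dates such as "May 7" or "September 12" (an assumption for illustration).
DATE_PATTERN = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\s+\d{1,2}\b"
)


def parse_syllabus(text: str) -> List[Topic]:
    """Treat each non-empty line as one topic and collect any dates it mentions."""
    topics = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Use the text before the first colon as the title, if there is a colon.
        title = line.split(":", 1)[0].strip()
        topics.append(Topic(title=title, dates=DATE_PATTERN.findall(line)))
    return topics
```

A real syllabus is far messier than one topic per line, so this only shows the kind of structure the server handed back to the application.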

The hackathon began at around 1:30 PM on Saturday and the deadline to submit projects was 9:30 AM on Sunday. We worked on it without facing any real problems all through Saturday afternoon and into Saturday night.

The hackathon had tables for 89 teams total, plus a number of booths for sponsors.

Soon enough, it was past midnight. We were starting to get tired but we were fueled by three things: teamwork, our goal to complete the project, and caffeinated beverages.


Still going strong at 2:15 AM.

Before we knew it, 4 AM rolled around. Some of my teammates went home to take naps or freshen up. Some found places to curl up. Regardless, we were driven by a singular purpose: submit our project before the 9:30 AM deadline and wow the judges at the 60-second demo.

Finally, at around 5:30 AM, we had completed what we’d set out to do! Our website was up and running, my NLP server was making calls to IBM Watson and interpreting the results correctly, and our backend server was fully functional and robust.

My team started prepping for the demo. I didn’t need to be involved, so I was left to my own devices. I was wide awake at this point, and I had around 4 hours to burn, so I decided to do some work on a project I’d been thinking about for a while.

I had been planning to make a Twitter bot that uses Agolo’s API to summarize the contents of any URL you tweet at it. This would be a follow-up to a similar Slack bot that I created a few weeks ago. I thought to myself, what better time to get started on this project than 6 AM at the TechCrunch hackathon after having stayed up all night?

I got to work on it. I picked Python because I have some experience working with Tweepy, a Twitter library. I knew that I had to circumvent the 140-character limit somehow, so I had the idea to use images to display the summary. I used the Python Imaging Library (PIL) for that.

I set up the Twitter account, got my code running on my AWS server, and started testing it. I had to make a number of tweaks to the way I was using PIL in order to make the text look good enough to demo. PIL doesn’t automatically do word wrap, so I had to find a way to insert newlines into the text where it made sense.
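For anyone curious, the word-wrap workaround can be sketched roughly like this. This is a reconstruction of the idea rather than my hackathon script: it assumes Pillow (the maintained fork of PIL) and the standard library’s `textwrap` module, and the function names, the 60-character line width, and the pixel estimates for the canvas size are all made up for illustration.

```python
import textwrap


def wrap_summary(text: str, max_chars: int = 60) -> str:
    """Insert newlines so no line exceeds max_chars, breaking only between words."""
    return textwrap.fill(text, width=max_chars)


def render_summary(text: str, path: str, max_chars: int = 60) -> None:
    """Draw the wrapped text in black on a white image and save it to path."""
    # Imported here so wrap_summary still works without Pillow installed.
    from PIL import Image, ImageDraw, ImageFont

    wrapped = wrap_summary(text, max_chars)
    line_count = wrapped.count("\n") + 1

    # Rough canvas size for the small default font (assumed ~7 px per character
    # and ~15 px per line, plus a 10 px margin on each side).
    width = max_chars * 7 + 20
    height = line_count * 15 + 20

    image = Image.new("RGB", (width, height), "white")
    draw = ImageDraw.Draw(image)
    draw.multiline_text((10, 10), wrapped, fill="black", font=ImageFont.load_default())
    image.save(path)
```

Calling `render_summary(summary_text, "summary.png")` would then produce an image that can be attached to a tweet. Wrapping by character count is only an approximation for proportional fonts, which is part of why I needed so many tweaks to get the text to look good.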

Finally, with around 20 minutes left until the deadline, I hacked together a working Python script that could achieve my project’s goal!

I submitted it, dead tired, forgetting one important detail: submitting a project meant I had to give a demo onstage. I was about to fall asleep, but this realization was a shot of adrenaline that kept me awake.

It was time for the demos to start.


The auditorium from halfway back.

I sat in the audience and mentally prepared some things to say at my demo. I checked, double-checked, and triple-checked that my project was working.

Meanwhile, my teammates Tom and Shabnum went up to present Mindset. They did a wonderful job despite the technical difficulties they faced. I was proud to see the end result of a long night of hard work being presented up onstage.


Tom and Shabnum setting up the laptop for their demo.

They called up the next batch of presenters to wait backstage. There was a 10-minute break during this period, so I got to practice my speech a little and meet some of the other presenters.

Backstage at the control booth. The black wall with the green lights at the top is the stage’s backdrop.

Finally, I was next in line to go onstage, set up my laptop, and wait for the previous presenter to finish.

Photos taken just offstage. I was balancing my open laptop in one arm as I took these pictures. I probably should not have taken this risk.

Then, I presented. I don’t remember most of it. The lack of sleep, combined with the adrenaline, put me in a state where I was giving an impassioned presentation of my project instead of paying attention to the hundreds of faces looking at me from the audience.

I walked offstage and back to my seat. I came down from the rush, and my tiredness finally took over. It took a lot of effort to finish watching the rest of the presentations and the awards ceremony. Then, I finally stepped outside for the first time in many hours.

Finally, sunlight and fresh air. Well, as fresh as it gets in NYC.

It was sunny for the first time in a week. It was a strange feeling to finally feel the sun on my skin after many days, punctuated by an intense experience like that.

I somehow made my way home.

Then, I slept for 14 hours.

All in all, it was a really fun experience. It was like a marathon, but with my team to keep it light and make it enjoyable. My 6 AM decision to start working on my own project turned out well, even though I wasn’t in my right mind when I made that choice. Sleep-deprived me chose to take a big risk instead of playing it safe, and that’s a lesson I can learn from him. My main takeaway is to challenge myself and push my boundaries whenever possible, because the reward is often underestimated and the risk is often overestimated.