• C# or Java? TypeScript or JavaScript? Machine learning based classification of programming languages

      GitHub hosts over 300 programming languages, from commonly used languages such as Python, Java, and JavaScript to esoteric languages such as Befunge, known only to very small communities.


      Figure 1: Top 10 programming languages hosted by GitHub by repository count 

      One challenge GitHub necessarily faces is recognizing these different languages. When code is pushed to a repository, it's important to recognize the type of code that was added for the purposes of search, security vulnerability alerting, and syntax highlighting, and to show the repository's content distribution to users.

      Linguist is the tool we currently use to detect coding languages at GitHub. Linguist is a Ruby-based application that uses various strategies for language detection, leveraging naming conventions and file extensions and also taking into account Vim or Emacs modelines, as well as the content at the top of the file (shebang). Linguist handles language disambiguation via heuristics and, failing that, via a Naive Bayes classifier trained on a small sample of data.
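
      The strategies listed above can be pictured as a simple cascade. The following is a minimal sketch only, not Linguist's actual implementation: the lookup tables, the function names, the regular expressions, and the order of the checks are all assumptions made for illustration.

```python
import posixpath
import re

# Hypothetical lookup tables for illustration; real tools use far larger ones.
EXTENSIONS = {".py": "Python", ".rb": "Ruby", ".js": "JavaScript", ".java": "Java"}
INTERPRETERS = {"python": "Python", "ruby": "Ruby", "node": "JavaScript",
                "bash": "Shell", "sh": "Shell"}
MODES = {"python": "Python", "ruby": "Ruby", "javascript": "JavaScript", "sh": "Shell"}

# Simplified patterns (assumptions): "vim: set ft=ruby:" and "-*- mode: ruby -*-".
VIM_MODELINE = re.compile(r"\bvim?:.*?\b(?:ft|filetype)=(\w+)")
EMACS_MODELINE = re.compile(r"-\*-.*?\bmode:\s*([\w+-]+).*?-\*-")


def from_modeline(text):
    """Look for an editor modeline in the first few lines of the file."""
    head = "\n".join(text.splitlines()[:5])
    for pattern in (VIM_MODELINE, EMACS_MODELINE):
        match = pattern.search(head)
        if match:
            language = MODES.get(match.group(1).lower())
            if language:
                return language
    return None


def from_extension(filename):
    """Map the file extension to a language, if the extension is known."""
    ext = posixpath.splitext(filename)[1].lower()
    return EXTENSIONS.get(ext)


def from_shebang(text):
    """Read the interpreter name from a '#!' first line."""
    first_line = text.split("\n", 1)[0]
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    name = posixpath.basename(parts[0])
    if name == "env" and len(parts) > 1:
        name = parts[1]  # "#!/usr/bin/env python3" names the interpreter second
    name = re.sub(r"[\d.]+$", "", name)  # strip a version suffix: python3 -> python
    return INTERPRETERS.get(name)


def detect_language(filename, text):
    """Try each strategy in turn and return the first answer, or None."""
    return from_modeline(text) or from_extension(filename) or from_shebang(text)
```

      In this sketch, a snippet with no extension, no shebang, and no modeline yields `None`; a content-based classifier is the kind of fallback that would take over at that point.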

      Although Linguist does a good job making file-level language predictions (84% accuracy), its performance declines considerably when files use unexpected naming conventions and, crucially, when a file extension is not provided. This renders Linguist unsuitable for content such as GitHub Gists or code snippets within READMEs, issues, and pull requests.

      To make language detection more robust and maintainable in the long run, we developed OctoLingua, a machine learning classifier based on an Artificial Neural Network (ANN) architecture that can handle language predictions in tricky scenarios. The current version of the model is able to make predictions for the top 50 languages hosted by GitHub and surpasses Linguist in accuracy and performance.
    • How do you choose products in stores?

      • Translation

      The most important single ingredient in the formula of success is knowing how to get along with people. Theodore Roosevelt

      In the previous article I tried to cover the basics of pricing analytics. Now I'd like to talk about something more interesting.

      Have you ever thought about why you choose certain products in stores, why you prefer them to other similar ones? Many shopping trips are spontaneous, so it's probably impossible to give a clear answer for all the times you go shopping. But the general idea is obvious: you go shopping for a specific reason (to get food, a gadget, for entertainment, to play blackjack). In this article I'm going to use available data from grocery retailers to talk about how a set of basic logical assumptions and community analysis can help us determine the way customers choose products.
    • Most Popular Computer Problems We Are Facing Everyday

        In today’s world, the personal computer has become a staple of daily life.
        Even the few people who don't use computers at work probably have access to one on which they perform other necessary tasks.

        With all of the access to information that computers allow, and with all of the work they help a person perform, the trend toward a computer in every home and every place of business isn't shocking.

        But what may be shocking, and downright aggravating, is when the computer you are working on suddenly shuts off, goes blank, or explodes in the dreaded blue screen of death.

        These are among the most frequent problems that computer users run into.

        The following is a list of five common computer issues and what can be done to fix them.
      • How AI, drones and cameras are keeping our roads and bridges safe

          «It's a dangerous business, Frodo, going out your door. You step onto the road, and if you don't keep your feet, there's no knowing where you might be swept off to.»

          ― J.R.R. Tolkien, The Lord of the Rings

          Europe’s roads are the safest in the world. Current figures show that there are 50 fatalities per one million inhabitants, compared to the global figure of 174 deaths per million. Despite this, each loss remains a tragedy. In 2017, 25,300 people lost their lives on European roads.

          The causes of these accidents vary from human error and weather conditions to damaged structures and surfaces. While some things are beyond our control, road and bridge conditions are a variable that can be managed.

          As soon as a road is paved, a combination of traffic and weather conditions begins to degrade and erode the surface. Undetected cracks, abrasions or defects can quickly lead to bigger problems, such as costly repairs, major traffic delays, and in the worst cases, unsafe conditions. These problems are also shared by bridges, particularly when concrete is critical in maintaining the integrity of the structure. The earlier faults are detected, the sooner they can be addressed, saving time and money, while minimising disruption. Ultimately, this helps ensure that the roads themselves are safer for those travelling on them.

          The detection of these faults, however, can be very difficult to carry out manually, especially as early-forming cracks are hard to spot with the naked eye. Predicting where faults are likely to occur ahead of time, so that appropriate measures can be taken in advance, also poses a massive challenge. Thankfully, technology is here to help.

        • The big interview with Martin Kleppmann: “Figuring out the future of distributed data systems”



            Dr. Martin Kleppmann is a researcher in distributed systems at the University of Cambridge, and the author of the highly acclaimed «Designing Data-Intensive Applications» (O'Reilly Media, 2017). 

            Kevin Scott, CTO at Microsoft, once said: «This book should be required reading for software engineers. Designing Data-Intensive Applications is a rare resource that connects theory and practice to help developers make smart decisions as they design and implement data infrastructure and systems.»

            Martin’s main research interests include collaboration software, CRDTs, and formal verification of distributed algorithms. Previously he was a software engineer and an entrepreneur at several Internet companies including LinkedIn and Rapportive, where he worked on large-scale data infrastructure.

            Vadim Tsesko (@incubos) is a lead software engineer at Odnoklassniki who works on the Core Platform team. Vadim’s scientific and engineering interests include distributed systems, data warehouses and verification of software systems.

            Contents:


            • Moving from business to academic research;
            • Discussion of «Designing Data-Intensive Applications»;
            • Common sense against artificial hype and aggressive marketing;
            • Pitfalls of CAP theorem and other industry mistakes;
            • Benefits of decentralization;
            • Blockchains, Dat, IPFS, Filecoin, WebRTC;
            • New CRDTs. Formal verification with Isabelle;
            • Event sourcing. Low level approach. XA transactions; 
            • Apache Kafka, PostgreSQL, Memcached, Redis, Elasticsearch;
            • How to apply all those tools to real life;
            • Expected target audience of Martin’s talks and the Hydra conference.

          • Airbus reaches new heights with the help of Microsoft mixed reality technology

              It took Airbus 40 years to build its first 10,000 aircraft. Over the next 20 years, the aerospace giant aims to build 20,000 more, a formidable challenge that will require cutting-edge innovation.


              Holographic technology from Microsoft, known as “mixed reality” because it combines physical and digital worlds, will be key to helping Airbus reach this ambitious goal.


              An aerial view of Airbus jets.
            • How to save $58 in 5 minutes: let’s use different prices in each country against marketers


                Hello Habr! It is now summer vacation season. Many of you will take a flight to a place far from your everyday routine at home. Before this hot vacation season starts, we should discuss an interesting and useful method for saving money using a VPN.

                One of the easiest ways to see the value in this is to look at car rentals while on vacation.
              • A drawing bot for realizing everyday scenes and even stories

                  Drawing bot


                  If you were asked to draw a picture of several people in ski gear, standing in the snow, chances are you’d start with an outline of three or four people reasonably positioned in the center of the canvas, then sketch in the skis under their feet. Though it was not specified, you might decide to add a backpack to each of the skiers to jibe with expectations of what skiers would be sporting. Next, you’d carefully fill in the details, perhaps painting their clothes blue, scarves pink, all against a white background, rendering these people more realistic and ensuring that their surroundings match the description. Finally, to make the scene more vivid, you might even sketch in some brown stones protruding through the snow to suggest that these skiers are in the mountains.


                  Now there’s a bot that can do all that.

                • Launching a taxi-hailing app in Tokyo: How Sony does it with S.Ride?


                    Uber, as we know, operates only in 650 cities and remains the best among all taxi apps. But have you ever thought about other cities and their demand for taxi applications? If you have, you will certainly have come across a few regional apps like Ola, Didi Chuxing, and Japan Taxi. These apps focus on meeting the demands of locals, and in that way they have succeeded and generated tremendous revenue. If you look for the reason behind the success of these apps, it inevitably comes down to the kind of service they provide to their customers. So it all depends on how well you deliver your service, whether you focus regionally or globally.
                  • How Moovit improved its app to help people with disabilities ride transit with confidence

                      Alexandr Epaneshnikov, a 19-year-old Russian student who is legally blind, recently decided he wanted to be more independent by commuting on his own and relying less on his mom for rides to school. It meant taking a streetcar to a subway to his high school in Moscow, a 30-minute trip that Epaneshnikov assuredly navigates with a cane and Moovit, an urban mobility app optimized for screen readers.


                    • Bluetooth stack modifications to improve audio quality on headphones without AAC, aptX, or LDAC codecs

                        Before reading this article, it is recommended to read the previous one: Audio over Bluetooth: most detailed information about profiles, codecs, and devices / in Russian

                        Some wireless headphone users note low sound quality and a lack of high frequencies when using the standard Bluetooth SBC codec, which is supported by all headphones and other Bluetooth audio devices. A common recommendation for getting better sound quality is to buy devices and headphones with aptX or LDAC codec support. These codecs require licensing fees, which is why devices with them are more expensive.

                        It turns out that the low quality of SBC is caused by artificial limitations in all current Bluetooth stacks and in headphone configurations, and these limitations can be circumvented on any existing device with a software modification only.
                      • Audio over Bluetooth: most detailed information about profiles, codecs, and devices

                          XKCD comic. How standards proliferate. SITUATION: there are 14 competing standards. Geek: 14?! Ridiculous! We need to develop one universal standard that covers everyone's use cases. Geek's girlfriend: yeah! SOON: Situation: there are 15 competing standards.

                          This article is also available in Russian.

                          The mass market of smartphones without the 3.5 mm audio jack has changed the headphone industry: wireless Bluetooth headphones have become the main way to listen to music and communicate in headset mode for many users.
                          Bluetooth device manufacturers rarely disclose detailed product specifications, and Bluetooth audio articles on the Internet are contradictory and sometimes incorrect. They do not cover all the features and often repeat the same false information.
                          Let's try to understand the protocol, the capabilities of Bluetooth stacks, headphones and speakers, and Bluetooth codecs for music and speech, find out what affects the quality of the transmitted audio and the delay, and learn how to capture and decode information about supported codecs and other device features.

                          TL;DR:

                          • SBC codec is OK
                          • Headphones have their own per-codec equalizer and post processing configuration
                          • aptX is not as good as the advertisements say
                          • LDAC is marketing fluff
                          • Voice audio quality is still low
                          • Browsers are able to execute audio encoders compiled to WebAssembly from C using emscripten, and they won't even lag.

                        • The Data Structures of the Plasma Cash Blockchain's State

                          • Tutorial


                          Hello, dear Habr users! This article is about Web 3.0, the decentralized Internet. Web 3.0 introduces the concept of decentralization as the foundation of the modern Internet. Many computer systems and networks require security and decentralization features to meet their needs. A distributed ledger using blockchain technology provides efficient solutions for decentralization.
                        • The one who resurrected Duke Nukem: interview with Randy Pitchford, magician from Gearbox

                            RUVDS and Habr continue the series of interviews with interesting people in the IT field. Last time we talked to Richard «Levelord» Gray, level designer of the popular games Duke Nukem, American McGee’s Alice, Heavy Metal F.A.K.K.2, SiN, and Serious Sam, and author of the well-known phrase «You’re not supposed to be here».

                            Today we welcome Randall Steward «Randy» Pitchford II, president, CEO and co-founder of the video game development company Gearbox Software.

                            Randy started at 3D Realms, where he contributed to Duke Nukem 3D Atomic Edition and Shadow Warrior. Then he founded Gearbox Software and made Half-Life: Opposing Force, which won D.I.C.E in 2000. Other Gearbox titles include Half-Life: Blue Shift, Half-Life: Decay, Counter-Strike: Condition Zero, James Bond 007: Nightfire, Tony Hawk's Pro Skater 3, Halo: Combat Evolved and of course Borderlands.

                            The interview team also includes editor of Habr Nikolay Zemlyanskiy, Richard «Levelord» Gray, Randy’s wife Kristy Pitchford and Randy’s son Randy Jr.


                          • How to Choose the Best Project Management Tool If You Are a Millennial?

                              The role of project management is becoming more and more relevant no matter in which area or industry it is implemented. In fact, it balances all project processes and steps and helps project teams meet their goals and objectives.

                              Project management helps teams reach goals faster and more cheaply and avoid risks, thereby contributing greatly to business strategy execution. More and more companies cannot imagine succeeding without project management as one of their key business competencies. Since this competence is actively developing, professional project management software is evolving along with it.

                            • Must-Have Mobile Application Animations

                                Animation is at the heart of the mobile app user experience (UX). In fact, animated transitions quietly convey a variety of messages and show the user how to navigate through the mobile app simply by directing the user's attention.

                                For instance, animations can signify a relationship between shared elements. They can also be essential in showing the transition between two states or in directing the user's attention to a call-to-action button.

                                Such dynamic motion in mobile apps is always a functional component rather than decoration like design elements. Therefore, motion in UX design should be considered from the very beginning, when the team plans the user's journey.

                                When the impact and usability of animated elements are analyzed during the QA testing stage, these transitions can also be eliminated if they fail to deliver a positive effect.

                                Animated interactions can create seamless synergy between screens or mark a moment of change. For instance, these actions can be any or all of the following:

                                • Check the box
                                • Navigate to another page
                                • Open settings
                                • Show system status
                                • Send a message

                                The above is only the tip of the iceberg. So as we quickly approach the end of 2018, let's explore the three hottest mobile app animation trends that are enhancing UX across the board.
                              • What are the application areas of 3D printing?

                                What is 3D printing?


                                3D printing is a new way of manufacturing solid objects based on the principle of discrete-stacking.
                                As the trend has evolved, 3D printing has become a way to promote smart manufacturing, flexible manufacturing and green manufacturing. It can realize the integrated formation of complex structures that are difficult or even impossible to process with traditional manufacturing technologies, greatly enhancing process capability and bringing disruptive advances to equipment design and manufacturing.

                                Once used only to manufacture models, 3D printing is now gradually being used for the direct manufacturing of products, and the technology is developing towards the integration of “design-material-manufacturing”. According to research, the size of the world 3D printing market in 2017 was 3.86 billion US dollars. In 2018, the scale of China's 3D printing market was as high as 7.75 billion US dollars, which has more than doubled in two years. From behind the scenes to the forefront, the 3D printing industry has moved from the concept introduction period to the rapid development period, all thanks to one thing: application. Members of the World 3D Printing Association have said:

                                When 3D printing technology leaves the lab,
                                its development is driven by its application.
                              • 5 Robust Prioritization Techniques for IT Teams

                                  Is it always easy for you to prioritize the tasks in a huge project? What if five or more tasks all have top priority and urgency?

                                  Experienced project managers and product owners know that intuition is not enough in such cases. To avoid missing deadlines, managers today can apply proven methodologies for setting priorities, as well as modern tools that help visualize data so that nothing in their workflows is missed.
