Sunday, March 5, 2017

Python and R for code development

The previous post touched on why I now prefer Python to write code, including for a module like logopt. This post explains in more detail some specific differences where I prefer one of these two languages:
  • 0-based indexing in Python versus 1-based indexing in R.  This may seem a small difference, but for me 0-based indexing is more natural and results in fewer off-by-one errors.  No less than Dijkstra agrees with me on 0-based indexing.
  • = versus <- for assignment.  I like R's approach here, and I would like to see more languages do the same.  I still sometimes end up using = where I wanted ==.  If only R would allow <- in call arguments.
  • CRAN versus pypi
    • CRAN is much better for the user: the CRAN Task Views are a gold mine, and in general CRAN is a better repository, with higher quality packages.
    • But publishing on CRAN is simply daunting, which is why logopt remained on R-Forge only.  The manual explaining how to write extensions is 178 pages long.
  • Python has better data structures; in particular, the Python dictionary is something I miss whenever I write in R.  Python has no native dataframe, but this is easily taken care of by importing pandas.
  • Object orientation is conceptually clean and almost easy to use in Python, less so in R.
  • Plotting is better in R.  There are some efforts to make Python better in that area, especially for ease of use.  Matplotlib is powerful but difficult to master.
  • lm is a gem in R: the simplicity with which you can express the models you want to fit is incredible.
All in all, I prefer coding in Python.  This is a personal opinion of course, and R remains important because of some packages, but for more general purpose tasks, Python is simpler to use, and that translates into being more productive.
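As a small illustration of the indexing and dictionary points above, here is a minimal sketch. The prices, tickers, and variable names are made up for the example and do not come from logopt.

```python
# Illustrative only: the prices and tickers below are made-up values.
prices = [10.0, 10.5, 10.2, 10.8]

# 0-based indexing with half-open slices: prices[a:b] has b - a elements,
# so two adjacent slices join back together without overlap or gap.
split = 2
head, tail = prices[:split], prices[split:]
assert len(head) == split
assert head + tail == prices

# A dictionary maps hashable keys to values, with insertion and lookup by key.
weights = {"AAA": 0.6, "BBB": 0.4}
weights["CCC"] = 0.0                  # insert a new key
total = sum(weights.values())

# Keys do not have to be strings; a tuple works as well.
tuple_keyed = {("AAA", 2017): 10.8}
```

The slice identity `head + tail == prices` holds for any split point, which is the kind of property that keeps off-by-one errors away.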

13 comments:

  1. You know you're just igniting a flame war nearly as long-lasting as "emacs vs. vi," right? FWIW, I avoid python largely because of the white-space-as-delimiter model. It's not only hard to see, but copy/paste operations often replace with N , which makes the code grouchy and the programmer sad.

    Replies
    1. oops - it appears blogspot treats anything inside <> as html tags. It should have read "...replace TAB with N*SPACE".

  2. I don't see Python versus R as a dichotomy; I still use both myself, and the exact choice should always be based on personal preference. So no flame war please :-).

    The white space as delimiter does indeed seem to *deeply* bother some people; personally, it has (almost) never been an issue. Basically you want correct indenting anyhow for human readability, so why not make proper indenting mandatory by making it part of the syntax? You can configure your editor or IDE to do the right thing, i.e. never use TAB, and that takes care of most cut and paste problems.

    There is one specific case where it is a problem: when you want to autogenerate code or do some other automated code processing. It is much simpler to do so with explicit block delimiters. Most people will not hit that corner case, but it can be a little painful.

  3. devtools::check() without notes and warnings and you can submit to CRAN. It does not get any easier than that.

    Isn't Python dictionary a named list in R?

    0-based indexing is really a matter of taste. I personally hate it, because I had the misfortune of being born with fingers which are not numbered from zero. But I have successfully adjusted and it is just one of those things.

  4. Good to have these discussions.

    FWIW, R package development is much easier if you follow Hadley Wickham's R Packages book. It is designed to give you best practices and the devtools package makes developing R packages much much easier - the book is free as well! http://r-pkgs.had.co.nz/
