A sense of A.I. in business

Leap year handling in Python

According to Wikipedia, a leap year (also known as an intercalary year or bissextile year) is a calendar year containing one additional day (or, in the case of lunisolar calendars, a month) added to keep the calendar year synchronized with the astronomical or seasonal year. Because seasons and astronomical events do not repeat in a whole number of days, calendars that have the same number of days in each year drift over time with respect to the event that the year is supposed to track. By inserting (also called intercalating) an additional day or month into the year, the drift can be corrected. A year that is not a leap year is called a common year.

In other words, a common year has 365 days while a leap year has 366 days (the extra day is the 29th of February).

After discussing datetime handling in programming languages with a few scientists, we were left wondering whether leap years and leap seconds are really handled by common programming languages, especially Python. I have been satisfied with Python's strengths and capabilities, but I was not so sure how well it deals with calendar issues such as leap years and leap seconds. To illustrate whether Python handles leap years correctly, here are some examples:

In [2]: import datetime

In [3]: datetime.datetime(2011, 2, 28) + datetime.timedelta(days=10)
Out[3]: datetime.datetime(2011, 3, 10, 0, 0)

In [4]: datetime.datetime(2011, 2, 28) + datetime.timedelta(days=1)
Out[4]: datetime.datetime(2011, 3, 1, 0, 0)

In [5]: datetime.datetime(2012, 2, 28) + datetime.timedelta(days=1)
Out[5]: datetime.datetime(2012, 2, 29, 0, 0)

Note that there were only 28 days in February 2011, while there were 29 days in February 2012. Sounds good.
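For reference, the rule behind those results can be written down directly. The sketch below is my own illustration, not code from the Python library: is_leap_year and days_in_february are hypothetical helper names, and the rule is cross-checked against the standard calendar module.

```python
import calendar


def is_leap_year(year: int) -> bool:
    """Gregorian rule: divisible by 4, except century years not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_february(year: int) -> int:
    """Number of days in February, as reported by the calendar module."""
    return calendar.monthrange(year, 2)[1]


for year in (1900, 2000, 2011, 2012):
    # The hand-written rule should agree with the standard library.
    assert is_leap_year(year) == calendar.isleap(year)
    print(year, is_leap_year(year), days_in_february(year))
```

The century years 1900 and 2000 are included because they are where the simple "divisible by 4" shortcut goes wrong.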

So how about the datetime object itself?

According to the Python library manual, a datetime object is a single object containing all the information from a date object and a time object. Like a date object, datetime assumes the current Gregorian calendar extended in both directions; like a time object, datetime assumes there are exactly 3600*24 seconds in every day.
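Both assumptions are easy to observe. The following is a small sketch rather than anything from the manual; the 1582 date is just an example I picked, because in the countries that adopted the Gregorian calendar first, 4 October 1582 was followed by 15 October.

```python
import datetime

# Every day is treated as exactly 3600 * 24 seconds long.
one_day = datetime.timedelta(days=1)
print(one_day.total_seconds())

# The Gregorian calendar is extended backwards with no gap: the day after
# 4 October 1582 is simply 5 October 1582 as far as datetime is concerned.
next_day = datetime.date(1582, 10, 4) + one_day
print(next_day)
```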

Okay, so what about leap seconds?

According to a September 2016 post on Stack Overflow, leap seconds are occasionally manually scheduled. Currently, computer clocks have no facility to honour leap seconds; there is no standard to tell them up-front to insert one. Instead, computer clocks periodically re-synchronise their timekeeping via NTP and adjust automatically after the leap second has been inserted.
Next, computer clocks usually report the time as seconds since the epoch. It would be up to the datetime module to adjust its accounting when converting that second count to include leap seconds. It doesn't do this at present. time.time() just reports a count of seconds since the epoch.
So nothing different happens when a leap second officially takes effect, other than that your computer clock will be 1 second off for a little while.
The only issue with datetime is representing a leap-second timestamp, which it can't do. It won't be asked to do so anyway.
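To see this in practice, here is a minimal sketch built around the leap second that was inserted at 23:59:60 UTC on 31 December 2016. The helper name can_represent is hypothetical and only exists for this illustration.

```python
import datetime


def can_represent(year, month, day, hour, minute, second) -> bool:
    """Return True if datetime accepts the given wall-clock fields."""
    try:
        datetime.datetime(year, month, day, hour, minute, second)
    except ValueError:
        # datetime only accepts seconds in the range 0..59.
        return False
    return True


print(can_represent(2016, 12, 31, 23, 59, 59))
print(can_represent(2016, 12, 31, 23, 59, 60))

# The day containing the leap second is still counted as 86400 seconds.
elapsed = datetime.datetime(2017, 1, 1) - datetime.datetime(2016, 12, 31)
print(elapsed.total_seconds())
```

The second check fails because datetime has no slot for second 60, and the subtraction ignores the extra second entirely.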

Rest assured that Python has handled leap years well in the past and will hopefully keep doing so in the future.

In [6]: datetime.datetime(2049, 2, 28) + datetime.timedelta(days=1)
Out[6]: datetime.datetime(2049, 3, 1, 0, 0)

In [7]: datetime.datetime(2050, 2, 28) + datetime.timedelta(days=1)
Out[7]: datetime.datetime(2050, 3, 1, 0, 0)

In [8]: datetime.datetime(2051, 2, 28) + datetime.timedelta(days=1)
Out[8]: datetime.datetime(2051, 3, 1, 0, 0)

In [9]: datetime.datetime(2052, 2, 28) + datetime.timedelta(days=1)
Out[9]: datetime.datetime(2052, 2, 29, 0, 0)

Pip install via local proxy server behind secured firewall

I just found that pip install works well at home but not in the office. One of the restrictions is a firewall that blocks non-HTTP traffic during package installation. Here's a trick:

Changing from this:

$ pip install --upgrade git+git://github.com/XXXXX/YYYYY.git

to this:

$ # x.y.z.s is IP address
$ # port is port number
$ pip --proxy=x.y.z.s:port install --upgrade git+https://github.com/XXXXX/YYYYY.git

Most firewalls won't block HTTP/HTTPS traffic, which is categorized as web traffic.
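If the install has to be scripted, the same command can be assembled from Python. This is only a sketch: build_pip_command and install_via_proxy are hypothetical helpers, and the proxy address and repository URL are the same placeholders used above, so they need to be replaced with real values before running anything.

```python
import subprocess
import sys


def build_pip_command(proxy: str, package_url: str) -> list:
    """Build the pip command line; --proxy is a global pip option placed before 'install'."""
    return [
        sys.executable, "-m", "pip",
        "--proxy", proxy,
        "install", "--upgrade", package_url,
    ]


def install_via_proxy(proxy: str, package_url: str) -> None:
    """Run pip through the proxy and raise if the installation fails."""
    subprocess.run(build_pip_command(proxy, package_url), check=True)


# Placeholders only: x.y.z.s is the proxy IP address and port is the port number.
cmd = build_pip_command("x.y.z.s:port", "git+https://github.com/XXXXX/YYYYY.git")
print(" ".join(cmd[1:]))
```

Calling pip through sys.executable keeps the install tied to the interpreter that runs the script, which helps when several Python environments are installed.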

Get Plotly offline working in Jupyter Lab

You might encounter a blank plot when you install the plotly module and use it in Jupyter Lab for the first time.

When plotting data with a library like Plotly, you are asked to create a user account and log in to use the online APIs. However, Plotly also provides an offline mode. Getting it to work takes a couple of manual steps.

This solution works on Windows, and may also work on Linux/macOS. Mileage varies.

To make sure Plotly offline works in Jupyter Lab, try the following:

Install the Plotly extension for Jupyter Lab:

> jupyter labextension install @jupyterlab/plotly-extension

For details, please visit https://github.com/jupyterlab/jupyter-renderers

You might also encounter an ETIMEDOUT error while installing the labextension if your computer is behind a proxy server. Here's how to resolve it:

> npm config set http-proxy <proxy address: port>
> npm config set https-proxy <proxy address: port>

For Plotly charts generated from a big dataset, the key is to increase the maximum output stream rate on the Jupyter Lab server.
Edit the following entry in the configuration file C:\Users\%USERNAME%\.jupyter\jupyter_notebook_config.py:
c.NotebookApp.iopub_data_rate_limit = 1.0e10

Here's a Plotly offline sample code block, tested on Jupyter Lab v0.31.12:

import plotly
from plotly.offline import init_notebook_mode, iplot
from plotly.graph_objs import Scatter

# Load plotly.js into the notebook so charts render without an online account.
init_notebook_mode()

print("plotly version:", plotly.__version__)
# iplot draws the figure inline in the notebook output cell.
iplot([Scatter(x=[1, 2, 3], y=[3, 1, 6])])

Conversion from Magnetic North to True North

Here's a website that provides the true north calculation for a given coordinate and date:

http://www.ga.gov.au/oracle/geomag/agrfform.jsp
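Once the magnetic declination for the location and date is known, the conversion itself is a single addition. The sketch below assumes the usual sign convention (declination is positive when magnetic north lies east of true north). The function name magnetic_to_true and the sample numbers are made up for illustration and are not values taken from the website.

```python
def magnetic_to_true(magnetic_bearing_deg: float, declination_deg: float) -> float:
    """Convert a magnetic bearing to a true bearing, both in degrees.

    Assumes declination_deg is positive east and negative west.
    """
    # Wrap the result back into the 0..360 range.
    return (magnetic_bearing_deg + declination_deg) % 360.0


# Hypothetical values, not for any real location.
print(magnetic_to_true(350.0, 12.5))   # 2.5
print(magnetic_to_true(10.0, -15.0))   # 355.0
```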

Quick recap of single line command to replace strings in heaps of files

I was looking for a fast way to recursively replace strings in all related Python script files from the terminal.

Here's the standard one:

$ find . -type f -name "*.py" -print | xargs sed -i 's/foo/bar/g'

xargs takes the list of files output by find and passes them to the command as arguments, running the command multiple times if necessary to stay within the maximum command-line length. In this case we combine xargs with sed.

Here's a variation:

$ find . -type f -name "*.py" -exec sed -i "s/foo/bar/g" {} \;

This one is a bit different but may be easier to remember. It uses the find command to produce the list of files. For each file found, find substitutes the filename for {} in the sed command, which replaces 'foo' with 'bar'. The escaped ';' marks the end of the command passed to -exec.

With this command, it would actually run something like this:

$ sed -i "s/foo/bar/g" script1_found.py;
$ sed -i "s/foo/bar/g" script2_found.py;
$ sed -i "s/foo/bar/g" script3_found.py;
$ sed -i "s/foo/bar/g" script4_found.py;
$ sed -i "s/foo/bar/g" script5_found.py;
$ sed -i "s/foo/bar/g" script6_found.py;
...
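Where sed is not available (on Windows, for instance), roughly the same job can be done from Python itself. This is a sketch under a few assumptions: replace_in_files is a hypothetical helper, the files are assumed to be UTF-8, and str.replace does a literal substitution, so unlike sed it does not interpret regular expressions.

```python
from pathlib import Path


def replace_in_files(root: str, old: str, new: str, pattern: str = "*.py") -> int:
    """Replace old with new in every matching file under root.

    Returns the number of files that were actually modified.
    """
    changed = 0
    # rglob walks the directory tree recursively, like find without -maxdepth.
    for path in Path(root).rglob(pattern):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8")
        updated = text.replace(old, new)
        if updated != text:
            # Rewrite the file in place, similar to sed -i.
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Calling replace_in_files(".", "foo", "bar") would mirror the commands above. Like sed -i, it overwrites files without keeping a backup, so it is worth trying on a copy first.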

WRF & ARW

What is WRF?

WRF is the short form of Weather Research and Forecasting Model, i.e., a numerical weather prediction system. WRF is a state-of-the-art atmospheric modeling system designed for both meteorological research and numerical weather prediction. It offers a host of options for atmospheric processes and can run on a variety of computing platforms.

Used for both research and operational forecasting
It is a supported "community model", i.e. a free and shared resource with distributed development and centralized support
Its development is led by NCAR, NOAA/ESRL and NOAA/NCEP/EMC with partnerships at AFWA, FAA, DOE/PNNL and collaborations with universities and other government agencies in the US and overseas

WRF Community Model

Version 1.0: December 2000
Version 2.0: May 2004 (added nesting)
Version 3.0: April 2008 (added global ARW version)
... (major releases in April, minor releases in summer)
Version 3.8: April 2016
Version 3.8.1: August 2016
Version 3.9: April 2017
Version 3.9.1(.1): August 2017

What is ARW?

WRF has two dynamical cores: the Advanced Research WRF (ARW) and the Non-hydrostatic Mesoscale Model (NMM)

A dynamical core mostly covers advection, pressure gradients, Coriolis, buoyancy, filters, diffusion, and time-stepping

Both are Eulerian mass dynamical cores with terrain-following vertical coordinates

ARW support and development are centered at NCAR/MMM

NMM development is centered at NCEP/EMC and support is provided by NCAR/DTC (operationally now only used for HWRF)

Usage of WRF

ARW and NMM

Atmospheric physics/parameterization research
Case-study research
Real-time NWP and forecast system research
Data assimilation research
Teaching dynamics and NWP

ARW only

Regional climate and seasonal time-scale research
Coupled-chemistry applications
Global simulations
Idealized simulations at many scales (e.g. convection, baroclinic waves, large eddy simulations)

Examples of WRF Forecast

Hurricane Katrina (August, 2005): Moving 4 km nest in a 12 km outer domain
US Convective System (June, 2005): Single 4 km central US domain

Real-Data Applications

Numerical weather prediction
Meteorological case studies
Regional climate
Applications: air quality, wind energy, hydrology, etc.

Ref: https://www.climatescience.org.au/sites/default/files/WRF_Overview_Dudhia_3.9.pdf
Ref: http://www2.mmm.ucar.edu/wrf/users/

CALPUFF

CALPUFF is an advanced, integrated Lagrangian puff modeling system for simulating atmospheric pollution dispersion. It is distributed by the Atmospheric Studies Group at TRC Solutions.

It is maintained by the model developers and distributed by TRC. The model has been adopted by the United States Environmental Protection Agency (EPA) in its Guideline on Air Quality Models as a preferred model for assessing long range transport of pollutants and their impacts on Federal Class I areas and on a case-by-case basis for certain near-field applications involving complex meteorological conditions.

The integrated modeling system consists of three main components and a set of preprocessing and postprocessing programs. The main components of the modeling system are CALMET (a diagnostic 3-dimensional meteorological model), CALPUFF (an air quality dispersion model), and CALPOST (a postprocessing package). Each of these programs has a graphical user interface (GUI). In addition to these components, there are numerous other processors that may be used to prepare geophysical (land use and terrain) data in many standard formats, meteorological data (surface, upper air, precipitation, and buoy data), and interfaces to other models such as the Penn State/NCAR Mesoscale Model (MM5), the National Centers for Environmental Prediction (NCEP) Eta model and the RAMS meteorological model.

The CALPUFF model is designed to simulate the dispersion of buoyant, puff or continuous point and area pollution sources as well as the dispersion of buoyant, continuous line sources. The model also includes algorithms for handling the effect of downwash by nearby buildings in the path of the pollution plumes.