Deep Learning: It's about information bottleneck

A new theory has emerged that aims to demystify deep learning. It seems to explain how deep learning works behind the scenes. The graph shows how a deep neural network evolves through the various stages of training.

Ref: https://www.quantamagazine.org/new-theory-cracks-open-the-black-box-of-deep-learning-20170921/

Photos app on macOS Sierra quits unexpectedly upon opening

Problems came one after another on my old MacBook, which showed signs of data corruption in the Photos app. First things first: it was still on macOS Sierra, but the Photos app might have gone through a recent update.

A message popped up when opening the Photos app, asking to repair the Photos library following a recovery from a Time Machine backup. I didn't mind if that could fix it, so I tried. After the repair operation, the Photos app could no longer be opened. Along with the crash came an error message saying 'Photos quits unexpectedly'.

After reading a post about a similar problem, it seems that a mismatched dynamic library was causing the Photos crash.

/Library/Caches/com.apple.xbs/Sources/PhotoApp_SubFrameworks/PhotoApp-3161.4.14 0/lib/photolibrary/PhotoLibraryPrivate/PhotoLibraryPrivate.m:23
One explanation:

The cached version of a dynamic library is newer than the installed version. First try starting your Mac in safe mode to clear any caches, then try to run Photos once in safe mode. See: Use safe mode to isolate issues with your Mac - Apple Support
To tackle the compatibility issue with the Photos app, two options are available:

1. Clear the library cache in safe mode in the hope that the Photos app will run again long enough to make a further backup. This looks like a temporary solution, though. I have a feeling that I might need to upgrade to macOS High Sierra as a remedy if the problem is related to a newer version of the Photos app.

2. Reinstall macOS on top of the running system (assuming the same version) in order to reinstall the Photos app, which is one of the built-in apps. Make sure a full Time Machine backup is stored somewhere for recovery.

For option 1, it helps to clarify what safe mode is:

Safe mode (sometimes called safe boot) is a way to start up your Mac so that it performs certain checks and prevents some software from automatically loading or opening. Starting your Mac in safe mode does the following:
    Verifies your startup disk and attempts to repair directory issues, if needed
    Loads only required kernel extensions
    Prevents startup items and login items from opening automatically
    Disables user-installed fonts
    Deletes font caches, kernel cache, and other system cache files
If your Mac has an issue that goes away when you start up in safe mode, you might be able to isolate the cause.
So here's how to start up in safe mode:

  1. Start or restart your Mac, then immediately press and hold the Shift key. The Apple logo appears on your display. If you don't see the Apple logo, learn what to do.
  2. Release the Shift key when you see the login window. If your startup disk is encrypted with FileVault, you might be asked to log in twice: once to unlock the startup disk, and again to log in to the Finder.
To leave safe mode, restart your Mac without pressing any keys during startup.

For option 2, reinstalling an app that came with macOS by using the Reinstall macOS option in recovery mode does not erase user information. So, hopefully, valuable personal data such as photos and videos can be retained this way. Of course, it is essential to make a full backup first.





disk0s2: I/O error again - Seriously?

My white MacBook could not boot up; it shut down during the boot process. I decided to handle it in single-user mode, entered by holding Command + S at startup.

After a couple of rounds of fsck checks, it still showed an error like "disk0s2: I/O error".

It was disheartening that fsck did not manage to repair the problems. I felt terrified when other users suggested things like reinstalling Mac OS or buying a big external drive to back up what I had on the hard drive, which was suspected to be damaged.

One post reporting a success gave me hope: it suggested that fsck can be forced to repair errors with special options.

Press Command + S during boot to enter single-user mode and try the following command:

$ /sbin/fsck_hfs -dryf /dev/disk0s2

The device name disk0s2 may vary depending on your MacBook model and configuration. As discussed on the forum, this command may need to be run multiple times before the repair succeeds.

After a series of 'orphaned file hard link', 'Missing thread record', and 'Invalid directory count' messages, messages like '*****The volume was modified ****' and 'repaired successfully' finally showed up on the screen. The machine is now ready to be rebooted to see whether the system boots properly.

$ reboot now

Here's a reference for the fsck_hfs command:

fsck_hfs
File system check for HFS and HFS+ (Hierarchical File System) volumes

fsck_hfs -q [-df] special ...   # check if clean unmount
fsck_hfs -p [-df] special ...   # check for inconsistencies only
fsck_hfs [-n | -y | -r] [-dfgxlES] [-D flags] [-b size] [-B path] [-m mode] [-c size] [-R flags] /dev/disknsp …   # repair inconsistencies

Example:

       > sudo /sbin/fsck_hfs    -d -D0x33 /dev/disk0s10
      journal_replay(/dev/disk0s10) returned 0
      ** /dev/rdisk0s10
          Using cacheBlockSize=32K cacheTotalBlock=32768 cacheSize=1048576K.
         Executing fsck_hfs (version hfs-305.10.1).
      ** Checking non-journaled HFS Plus Volume.
         The volume name is untitled
      ** Checking extents overflow file.
      ** Checking catalog file.
      ** Checking multi-linked files.
      ** Checking catalog hierarchy.
      ** Checking extended attributes file.
      ** Checking volume bitmap.
      ** Checking volume information.
      ** The volume untitled appears to be OK.
          CheckHFS returned 0, fsmodified = 0
With -p, fsck_hfs preens the specified file systems. In this form it is started by fsck(8) from /etc/rc.boot during boot.
When preening, it fixes common inconsistencies for file systems that were not unmounted cleanly. If more serious problems are found, fsck_hfs does not try to fix them, indicates that it was not successful, and exits.

With no options, fsck_hfs checks and attempts to fix the specified file systems.

-d debugging information.
-D flags extra debugging information. 
      0x0001 Informational
      0x0002 Error 
      0x0010 Extended attributes 
      0x0020 Overlapped extents 
      0x0033 include all
-b bytes size of the physical blocks used by -B
-B path Output the files containing the physical blocks listed in the file path. 
The file contains decimal, octal (with leading 0) or hexadecimal (with leading 0x) physical block numbers, separated by white space, relative to the start of the partition, For block numbers relative to the start of the device, subtract the block number of the start of the partition. 
The size of a physical block is given with the -b option; the default is 512 bytes per block.
-f with -p, force a check of `clean' file systems; otherwise, force a check and repair of journaled HFS+ file systems.
-g generate output strings in GUI format. This option is used when another application with a graphical user interface (like Mac OS X Disk Utility) is invoking the fsck_hfs tool.
-x generate output strings in XML (plist) format. Implies -g.
-l Lock down the file system (not limit parallel checks as in other versions of fsck) and perform a test-only check. This makes it possible to check a file system that is currently mounted, although no repairs can be made.
-m mode Permissions (rwxrwxrwx) for the lost+found directory if it is created; a restrictive mode such as 700 is suggested. Orphaned files and directories are moved to the lost+found directory (located at the root of the volume). The default mode is 01777, which is overly permissive.
-c size size of the cache used by fsck_hfs internally. A bigger size can result in better performance but can result in deadlock when used with -l. The size can be a decimal, octal, or hexadecimal number. If the number ends with a k, m, or g, it is multiplied by 1024, 1048576, or 1073741824, respectively.
-p Preen the specified file systems.
-q Causes fsck_hfs to quickly check whether the volume was unmounted cleanly. If the volume was unmounted cleanly, then the exit status is 0. If the volume was not unmounted cleanly, then the exit status will be non-zero. In either case, a message is printed to standard output describing whether the volume was clean or dirty.
-y Always attempt to repair any damage that is found.
-n Never attempt to repair any damage that is found.
-E exit (with a value of 47) if it encounters any major errors. A ``major error'' is considered one which would impact using the volume in normal usage; an inconsistency which would not impact such use is considered ``minor'' for this option. Only valid with the -n option.
-S scan the entire device looking for I/O errors. It will attempt to map the blocks with errors to names, similar to the -B option.
-R flags Rebuilds the requested btree. The following flags are supported:
      a Attribute btree
      c Catalog btree
      e Extents overflow btree
Rebuilding requires free space on the file system for the new btree file, and fsck_hfs must be able to traverse each of the nodes in the requested btree successfully. Rebuilding btrees is not supported on HFS Standard volumes.
-r Rebuild the catalog btree. This is synonymous with -Rc.
Because of inconsistencies between the block device and the buffer cache, the raw device should always be used.
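
The -D values listed above behave like a bitmask, which is why 0x0033 means "include all". Here is a minimal Python sketch of that arithmetic. The names DEBUG_FLAGS and decode_debug_flags are made up for illustration and are not part of fsck_hfs.

```python
# Debug flag bits for `fsck_hfs -D`, copied from the option list above.
DEBUG_FLAGS = {
    0x0001: "Informational",
    0x0002: "Error",
    0x0010: "Extended attributes",
    0x0020: "Overlapped extents",
}


def decode_debug_flags(value):
    """Return the names of the flag bits set in value, lowest bit first."""
    return [name for bit, name in sorted(DEBUG_FLAGS.items()) if value & bit]


# OR-ing every individual bit together gives the "include all" value.
all_flags = 0
for bit in DEBUG_FLAGS:
    all_flags |= bit

print(hex(all_flags))              # 0x33
print(decode_debug_flags(0x0030))  # ['Extended attributes', 'Overlapped extents']
```

So passing -D 0x30, for example, would request only the extended-attribute and overlapped-extent output.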
Example:
 > sudo fsck_hfs -l -d -D 0x0033 -B ~/.profile /dev/disk0s8
0 blocks to match:
** /dev/rdisk0s8 (NO WRITE)
    Using cacheBlockSize=32K cacheTotalBlock=32768 cacheSize=1048576K.
   Executing fsck_hfs (version hfs-305.10.1).
** Performing live verification.
** Checking Journaled HFS Plus volume.
   The volume name is DATA
** Checking extents overflow file.
** Checking catalog file.
** Checking multi-linked files.
   Orphaned open unlinked file temp7479645
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
    invalid VHB attributesFile.clumpSize 
   Volume header needs minor repair
(2, 0)
   Verify Status: VIStat = 0x8000, ABTStat = 0x0000 EBTStat = 0x0000
                  CBTStat = 0x0000 CatStat = 0x00000000
   Volume header needs minor repair
(2, 0)
   Verify Status: VIStat = 0x8000, ABTStat = 0x0000 EBTStat = 0x0000
                  CBTStat = 0x0000 CatStat = 0x00000000
** The volume DATA was found corrupt and needs to be repaired.
    volume type is pure HFS+ 
    primary MDB is at block 0 0x00 
    alternate MDB is at block 0 0x00 
    primary VHB is at block 2 0x02 
    alternate VHB is at block 88769454 0x54a83ae 
   Volume header needs minor repair
(2, 0)
   Verify Status: VIStat = 0x8000, ABTStat = 0x0000 EBTStat = 0x0000
                  CBTStat = 0x0000 CatStat = 0x00000000
** The volume DATA was found corrupt and needs to be repaired.
    volume type is pure HFS+ 
    primary MDB is at block 0 0x00 
    alternate MDB is at block 0 0x00 
    primary VHB is at block 2 0x02 
    alternate VHB is at block 88769454 0x54a83ae 
    sector size = 512 0x200 
    VolumeObject flags = 0x07 
    total sectors for volume = 88769456 0x54a83b0 
    total sectors for embedded volume = 0 0x00 
    CheckHFS returned 7, fsmodified = 0
 > echo $?
8
EXIT VALUES
0 No errors found, or successfully repaired.
3 A quick-check (the -n option) found a dirty filesystem; no repairs were made.
4 During boot, the root filesystem was found to be dirty; repairs were made, and the filesystem was remounted. The system should be rebooted.
8 A corrupt filesystem was found during a check, or repairs did not succeed.
47 A major error was found with -E.
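
Since the repair command may need several runs, a wrapper script could look at the exit status to decide what to do next. The following Python sketch only maps the exit values listed above to their meaning. The helper names and the retry rule (rerun only on status 8) are my own assumptions, not something fsck_hfs prescribes.

```python
# Exit values of fsck_hfs, paraphrased from the EXIT VALUES list above.
FSCK_HFS_EXIT = {
    0: "No errors found, or successfully repaired.",
    3: "Quick-check found a dirty filesystem; no repairs were made.",
    4: "Root filesystem was dirty at boot; repaired and remounted, reboot advised.",
    8: "Corrupt filesystem found, or repairs did not succeed.",
    47: "Major error found with -E.",
}


def describe_exit(status):
    """Translate an fsck_hfs exit status into a human-readable message."""
    return FSCK_HFS_EXIT.get(status, f"Undocumented exit status {status}")


def should_retry(status):
    """Hypothetical policy: rerun the repair only when it reported failure (8)."""
    return status == 8


print(describe_exit(8))
print(should_retry(8), should_retry(0))  # True False
```

In the live verification example above, `echo $?` printed 8, which this table reads as a corrupt filesystem that still needs repair.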

Here's the LDAP Servers and where to find them

It comes in handy to be able to search for the many unknown LDAP servers in a local domain via nslookup.

Here's the command to find them on Windows/Linux:

#
# Windows
C:\> nslookup -type=srv _ldap._tcp.dc._msdcs.[DOMAIN_NAME]
#
# Linux
$ nslookup -type=srv _ldap._tcp.dc._msdcs.[DOMAIN_NAME]


where [DOMAIN_NAME] is the real name of the local domain.
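
When the lookup has to be repeated for several domains, the query name can be assembled in a script. This is a small Python sketch that only builds the command, assuming nslookup accepts the record type as -type=srv. The function names and the domain corp.example.com are placeholders for illustration.

```python
def ldap_srv_name(domain):
    """Build the SRV record name under which domain controllers publish LDAP."""
    return f"_ldap._tcp.dc._msdcs.{domain.strip('.')}"


def nslookup_command(domain):
    """Return the nslookup argument list for the LDAP SRV query."""
    return ["nslookup", "-type=srv", ldap_srv_name(domain)]


# corp.example.com is a placeholder; substitute the real local domain name.
cmd = nslookup_command("corp.example.com")
print(" ".join(cmd))
# nslookup -type=srv _ldap._tcp.dc._msdcs.corp.example.com
```

The list can then be handed to subprocess.run(cmd, capture_output=True, text=True) on a machine that can reach the domain's DNS server.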



Leap year handling in Python

According to Wikipedia, a leap year (also known as an intercalary year or bissextile year) is a calendar year containing one additional day (or, in the case of lunisolar calendars, a month) added to keep the calendar year synchronized with the astronomical or seasonal year. Because seasons and astronomical events do not repeat in a whole number of days, calendars that have the same number of days in each year drift over time with respect to the event that the year is supposed to track. By inserting (also called intercalating) an additional day or month into the year, the drift can be corrected. A year that is not a leap year is called a common year.

In other words, a normal year has 365 days while a Leap Year has 366 days (the extra day is the 29th of February).

After discussing datetime handling in programming languages with a few scientists, we were left with a question: are the concepts of leap years and leap seconds really applied in common programming languages, especially Python? I have been satisfied with Python's strength and capability, but was not so sure how well it deals with calendar issues like leap years and leap seconds. To illustrate whether Python handles leap years correctly, here are some examples:

In [2]: import datetime

In [3]: datetime.datetime(2011, 2, 28) + datetime.timedelta(days=10)
Out[3]: datetime.datetime(2011, 3, 10, 0, 0)

In [4]: datetime.datetime(2011, 2, 28) + datetime.timedelta(days=1)
Out[4]: datetime.datetime(2011, 3, 1, 0, 0)

In [5]: datetime.datetime(2012, 2, 28) + datetime.timedelta(days=1)
Out[5]: datetime.datetime(2012, 2, 29, 0, 0)

As a highlight, there were only 28 days in February in 2011 while there were 29 days in February in 2012. Sounds good.
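
The same check can be done without date arithmetic. Below is a small sketch of the Gregorian leap year rule (divisible by 4, except century years not divisible by 400) compared against the standard library's calendar module. The helper is_leap is written here for illustration only.

```python
import calendar
import datetime


def is_leap(year):
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


for year in (1900, 2000, 2011, 2012):
    # The hand-written rule should agree with the standard library.
    assert is_leap(year) == calendar.isleap(year)
    days_in_feb = calendar.monthrange(year, 2)[1]
    print(year, is_leap(year), days_in_feb)
# 1900 False 28
# 2000 True 29
# 2011 False 28
# 2012 True 29

# A leap year is 366 days long.
print((datetime.date(2013, 1, 1) - datetime.date(2012, 1, 1)).days)  # 366
```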

So how about the datetime object itself?

According to the Python library manual, a datetime object is a single object containing all the information from a date object and a time object. Like a date object, datetime assumes the current Gregorian calendar extended in both directions; like a time object, datetime assumes there are exactly 3600*24 seconds in every day.

Okay, so what about Leap Seconds?

According to a post from September 2016 on Stack Overflow, leap seconds are occasionally manually scheduled. Currently, computer clocks have no facility to honour leap seconds; there is no standard to tell them up-front to insert one. Instead, computer clocks periodically re-sync their timekeeping via the NTP protocol and adjust automatically after the leap second has been inserted.
Next, computer clocks usually report the time as seconds since the epoch. It would be up to the datetime module to adjust its accounting when converting that second count to include leap seconds. It doesn't do this at present. time.time() will just report a time count based on the seconds since the epoch.
So, nothing different will happen when the leap second is officially in effect, other than that your computer clock will be 1 second off for a little while.
The issues with datetime only cover representing a leap second timestamp, which it can't. It won't be asked to do so anyway.
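
Both points can be checked quickly. The sketch below uses the leap second that was inserted at the end of 31 December 2016 (UTC): datetime refuses a seconds value of 60, and the interval across that leap second is counted as a single second.

```python
import datetime

# 2016-12-31 23:59:60 UTC was a real leap second, but datetime only
# accepts seconds in the range 0..59, so constructing it fails.
try:
    datetime.datetime(2016, 12, 31, 23, 59, 60)
    representable = True
except ValueError:
    representable = False
print(representable)  # False

# Two SI seconds elapsed between these instants, yet datetime counts one,
# because it assumes every day has exactly 3600*24 seconds.
before = datetime.datetime(2016, 12, 31, 23, 59, 59)
after = datetime.datetime(2017, 1, 1, 0, 0, 0)
gap = (after - before).total_seconds()
print(gap)  # 1.0
```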

Rest assured that Python has handled leap years well for past dates, and it should continue to do so for future ones.


In [6]:  datetime.datetime(2049, 2, 28) + datetime.timedelta(days=1)
Out[6]: datetime.datetime(2049, 3, 1, 0, 0)

In [7]:  datetime.datetime(2050, 2, 28) + datetime.timedelta(days=1)
Out[7]: datetime.datetime(2050, 3, 1, 0, 0)

In [8]:  datetime.datetime(2051, 2, 28) + datetime.timedelta(days=1)
Out[8]: datetime.datetime(2051, 3, 1, 0, 0)

In [9]:  datetime.datetime(2052, 2, 28) + datetime.timedelta(days=1)
Out[9]: datetime.datetime(2052, 2, 29, 0, 0)

Pip install via local proxy server behind secured firewall

I just found that pip install works well at home but not in the office. One of the restrictions is the firewall, which prevents non-HTTP traffic from passing through during package installation. Here's a trick:

Changing from this:
$
$ pip install --upgrade git+git://github.com/XXXXX/YYYYY.git


to this:
$ # x.y.z.s is the proxy server's IP address
$ # port is the proxy server's port number
$ pip --proxy=x.y.z.s:port install --upgrade git+https://github.com/XXXXX/YYYYY.git

Most firewalls won't block HTTP/HTTPS traffic, which is categorized as web traffic.
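
If many requirements use the git:// scheme, the rewrite can be scripted. Here is a minimal Python sketch that only builds the pip command line. The function names, the proxy address 10.0.0.1:3128, and the repository path are placeholders for illustration.

```python
def to_https_vcs_url(url):
    """Rewrite a git+git:// requirement URL to git+https://; leave others alone."""
    prefix = "git+git://"
    if url.startswith(prefix):
        return "git+https://" + url[len(prefix):]
    return url


def pip_install_command(url, proxy):
    """Return the pip argument list that installs url through the given proxy."""
    return ["pip", f"--proxy={proxy}", "install", "--upgrade", to_https_vcs_url(url)]


# Placeholder proxy and repository, mirroring x.y.z.s:port and XXXXX/YYYYY above.
command = pip_install_command("git+git://github.com/XXXXX/YYYYY.git", "10.0.0.1:3128")
print(" ".join(command))
# pip --proxy=10.0.0.1:3128 install --upgrade git+https://github.com/XXXXX/YYYYY.git
```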

Get Plotly offline working in Jupyter Lab

You might encounter a blank plot image the first time you install and use the plotly module in Jupyter Lab.

When plotting data with a library like Plotly, you will be asked to create a user account and log in to use the online APIs. However, Plotly does provide an offline version. It takes a couple of manual steps to get it working.

This solution works on Windows, and may also work on Linux and macOS. Mileage varies.

To get Plotly offline working in Jupyter Lab, try the following:

Install the plotly extension for Jupyter Lab:
> jupyter labextension install @jupyterlab/plotly-extension

For details, please visit https://github.com/jupyterlab/jupyter-renderers

You might also encounter an ETIMEDOUT error while installing the labextension if your computer is behind a proxy server. Here's how to resolve it:
>
> npm config set http-proxy <proxy address: port>
> npm config set https-proxy <proxy address: port>


If Plotly chart output from a big dataset fails to render, the key is to increase the maximum data rate of the output stream on the Jupyter Lab server.
Edit the following entry in the configuration file C:\Users\%USERNAME%\.jupyter\jupyter_notebook_config.py:
c.NotebookApp.iopub_data_rate_limit = 1.0e10


Here's a Plotly offline sample code block tested on Jupyter Lab v0.31.12:
from plotly import __version__
from plotly.offline import init_notebook_mode, iplot
from plotly.graph_objs import Scatter

# Load plotly.js into the notebook so charts render without an online account
init_notebook_mode()

print("plotly version:", __version__)
# Render a simple scatter trace inline
iplot([Scatter(x=[1, 2, 3], y=[3, 1, 6])])


apt install through corporate proxy

Assuming proxy service like CNTLM is up and running on Ubuntu machine, one can use apt-get to install package with specifying http proxy inf...