Dhall format #
Since mass writes now supports XLSX, we need to supply a sheet
name within the key transform file. Where we can specify the
sheet name via the dataSheet
key:
let tft =
\(csvR : Natural) ->
\(csvC : Natural) ->
\(pId : Natural) ->
{ dataCell = {csvRow = csvR, csvCol = csvC}
, dataSheet = Some "Sheet 1"
, dataPid = pId
, timeCell = None { csvRow : Natural, csvCol : Natural }
, timeFormat = None Text
}
in
...
Finding relative cells by
Regex #
You can also lookup the value of a cell with regexes:
[
< Absolute :
{ dataCell : { csvRow : Natural, csvCol : Natural }
, dataSheet : Optional Text
, dataPid : Natural
, timeCell : Optional { csvRow : Natural, csvCol : Natural }
, timeFormat : Optional Text
}
| Relative :
{ origin : { regexesRow : List Text, regexesCol : List Text }
, relativeDataCell : { relativeRow : Integer, relativeCol : Integer }
, dataSheet : Optional Text
, dataPid : Natural
, relativeTimeCell :
Optional { relativeRow : Integer, relativeCol : Integer }
, timeFormat : Optional Text
}>.Relative {
relativeDataCell = {relativeRow = +0, relativeCol = +1}
, origin = { regexesRow = ["a*","c.*"], regexesCol = ["data"] }
, dataPid = 15034318
, relativeTimeCell = None { relativeRow : Integer, relativeCol : Integer }
, timeFormat = None Text
, dataSheet = None Text
}]
Regex #
The origin field have the type
{ regexesRow : List Text, regexesCol : List Text }
. The
program is taking every input regex in regexesRow
and it
tries to find which row match every regex. It does the same process with
columns using regexesCol
. Once it find both, it takes the
resulting cell and add the relative value contained in
relativeDataCell
and relativeTimeCell
to find
where the actual data is.
Example #
In the example below we have a table that looks like this:<table>
name | state | data | time |
---|---|---|---|
Tank1 | Michigan | 34 | 2021-12-14 |
Tank2 | Oklahoma | 21 | 2021-12-13 |
Well1 | Oklahoma | 14 | 2022-12-12 |
Well2 | Texas | 62 | 2021-12-11 |
We can set the origin row as below…
origin = { regexesRow = ["Tank.*","Oklahoma"], regexesCol = ["data"] }
relativeDataCell = {relativeRow = +0, relativeCol = +0}
relativeTimeCell = Some {relativeRow = +0, relativeCol = +1}
Here, the only row that match both Tank.*
and
Oklahoma
is the 3rd row
Tank2, Oklahoma, 21, 2021-12-13
and the only column that
match data
is the 3rd column. The origin cell is
{row : 3, column : 3}
, the cell with the value
21
. This means that every other data point can be calculated relative to this row.
To find the actual data, we add the origin cell to the
relativeDataCell
, which gives us the cell
{row : 3 + 0, column : 3 + 0}
, so the data is
21
To find the time cell, we add the origin cell to the
relativeTimeCell
, which gives us the cell
{row : 3 + 0, column : 3 + 1}
, so the time is
2021-12-13
Fields description
- origin Regexes to find the origin cell
- relativeDataCell Where is the data cell you’re looking for relative to
the cell found using regexes. Can use positive and negative numbers - dataSheet optional excel sheet
- dataPid parameter ID of the entry
- relativeTimeCell Where is the time cell you’re looking for relative to
the cell found using regexes. You can use positive and negative numbers. - timeFormat Time format of the time cell
You can also mix the relative cells and the absolute
cells in the same file using this format:
< Absolute :
{ dataCell : { csvRow : Natural, csvCol : Natural }
, dataSheet : Optional Text
, dataPid : Natural
, timeCell : Optional { csvRow : Natural, csvCol : Natural }
, timeFormat : Optional Text
}
| Relative :
{ origin : { regexesRow : List Text, regexesCol : List Text }
, relativeDataCell : { relativeRow : Integer, relativeCol : Integer }
, dataSheet : Optional Text
, dataPid : Natural
, relativeTimeCell :
Optional { relativeRow : Integer, relativeCol : Integer }
, timeFormat : Optional Text
}>.Absolute {dataCell = {csvRow = 3, csvCol = 2}
, dataPid = 15034318
, timeCell = Some { csvRow = 4, csvCol = 1 }
, timeFormat = Some "%Y-%m-%d %H:%M"
, dataSheet = None Text
}
Time Formatting Helpers
TimeZone
- %z timezone offset in the format ±HHMM
- %Ez timezone offset in the format ±HH:MM
- %Z timezone name (or else offset in the format ±HHMM)
- %EZ timezone name (or else offset in the format ±HH:MM)
Time of Day
- %R same as %H:%M
- %T same as %H:%M:%S
- %X as timeFmt locale (e.g. %H:%M:%S)
- %r as time12Fmt locale (e.g. %I:%M:%S %p)
- %P day-half of day from (amPm locale), converted to lowercase, am, pm
- %p day-half of day from (amPm locale), AM, PM
- %H hour of day (24-hour), 0-padded to two chars, 00 – 23
- %k hour of day (24-hour), space-padded to two chars, 0 – 23
- %I hour of day-half (12-hour), 0-padded to two chars, 01 – 12
- %l hour of day-half (12-hour), space-padded to two chars, 1 – 12
- %M minute of hour, 0-padded to two chars, 00 – 59
- %S second of minute (without decimal part), 0-padded to two chars, 00 – 60
- %q picosecond of second, 0-padded to twelve chars, 000000000000 – 999999999999.
- %Q decimal point and fraction of second, up to 12 second decimals, without trailing zeros. For a whole number of seconds, %Q omits the decimal point unless padding is specified.
UTC Time and Zoned Time
- %s number of whole seconds since the Unix epoch. For times before the Unix epoch, this is a negative number. Note that in %s.%q and %s%Q the decimals are positive, not negative. For example, 0.9 seconds before the Unix epoch is formatted as -1.1 with %s%Q.
Day Of The Week
- %u day of week number for Week Date format, 1 (= Monday) – 7 (= Sunday)
- %w day of week number, 0 (= Sunday) – 6 (= Saturday)
- %a day of week, short form (snd from wDays locale), Sun – Sat
- %A day of week, long form (fst from wDays locale), Sunday – Saturday
Month
- %Y year, no padding. Note %0Y and %_Y pad to four chars
- %y year of century, 0-padded to two chars, 00 – 99
- %C century, no padding. Note %0C and %_C pad to two chars
- %B month name, long form (fst from months locale), January – December
- %b month name, short form (snd from months locale), Jan – Dec
- %m month of year, 0-padded to two chars, 01 – 12
Day
- %D same as %m/%d/%y
- %F same as %Y-%m-%d
- %x as dateFmt locale (e.g. %m/%d/%y)
- %d day of month, 0-padded to two chars, 01 – 31
- %e day of month, space-padded to two chars, 1 – 31
- %j day of year, 0-padded to three chars, 001 – 366
- %f century for Week Date format, no padding. Note %0f and %_f pad to two chars
- %V week of year for Week Date format, 0-padded to two chars, 01 – 53
- %U week of year where weeks start on Sunday (as sundayStartWeek), 0-padded to two chars, 00 – 53
- %W week of year where weeks start on Monday (as mondayStartWeek), 0-padded to two chars, 00 – 53