Release Notes (pre 1.0)¶
Note
These release notes are for versions of ibis prior to 1.0. For 1.0 and later release notes see Release Notes.
v0.14.0 (August 23rd, 2018)¶
This release brings refactored, more composable core components and a rewritten rule system to ibis. We also focused heavily on the BigQuery backend in this release.
New Features¶
Allow keyword arguments in Node subclasses (#968)
Splat args into Node subclasses instead of requiring a list (#969)
Add support for UNION in the BigQuery backend (#1408, #1409)
Support for writing UDFs in BigQuery (#1377). See the BigQuery UDF docs for more details.
Support for cross-project expressions in the BigQuery backend. (#1427, #1428)
Add strftime and to_timestamp support for BigQuery (#1422, #1410)
Require google-cloud-bigquery >=1.0 (#1424)
Limited support for interval arithmetic in the pandas backend (#1407)
Support for subclassing TableExpr (#1439)
Fill out pandas backend operations (#1423)
Add common DDL APIs to the pandas backend (#1464)
Implement the sql method for BigQuery (#1463)
Add to_timestamp for BigQuery (#1455)
Add the mapd backend (#1419)
Implement range windows (#1349)
Support for map types in the pandas backend (#1498)
Add mean and sum for boolean types in BigQuery (#1516)
All recent versions of SQLAlchemy are now supported (#1384)
Add support for NUMERIC types in the BigQuery backend (#1534)
Speed up grouped and rolling operations in the pandas backend (#1549)
Implement TimestampNow for BigQuery and pandas (#1575)
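The range windows entry above refers to window frames bounded by a distance in the ordering key's value rather than by a count of rows. A minimal pure-Python sketch of that idea follows. It is not ibis code: the function name range_window_sum and the sample data are hypothetical, and the frame is assumed to be "from key - preceding up to the current key, peers included".

```python
from bisect import bisect_left, bisect_right


def range_window_sum(keys, values, preceding):
    """Sum values whose key lies in [key - preceding, key] for every row.

    Hypothetical helper: assumes `keys` is sorted ascending. The frame is
    defined by key distance, so rows with equal keys share the same frame.
    """
    # Prefix sums let each frame total be computed in O(1).
    prefix = [0]
    for v in values:
        prefix.append(prefix[-1] + v)

    out = []
    for k in keys:
        start = bisect_left(keys, k - preceding)  # first key >= k - preceding
        end = bisect_right(keys, k)               # one past the last key <= k
        out.append(prefix[end] - prefix[start])
    return out


keys = [1, 2, 4, 7, 8]
values = [10, 20, 30, 40, 50]
sums = range_window_sum(keys, values, preceding=2)
```

A rows-based frame of "2 preceding" would always cover three rows. The range-based frame here covers however many rows fall within 2 units of the current key, which is why the gap between keys 4 and 7 leaves the row at key 7 alone in its frame.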
Bug Fixes¶
Nullable property is now propagated through value types (#1289)
Implicit casting between signed and unsigned integers checks boundaries
Fix precedence of case statement (#1412)
Fix handling of large timestamps (#1440)
Fix identical_to precedence (#1458)
Pandas 0.23 compatibility (#1458)
Preserve timezones in timestamp-typed literals (#1459)
Fix incorrect topological ordering of UNION expressions (#1501)
Fix projection fusion bug when attempting to fuse columns of the same name (#1496)
Fix output type for some decimal operations (#1541)
API Changes¶
The previous, private rules API has been rewritten (#1366)
Defining input arguments for operations happens in a more readable fashion instead of the previous input_type list.
Removed support for async query execution (only Impala supported)
Remove support for Python 3.4 (#1326)
BigQuery division defaults to using IEEE_DIVIDE (#1390)
Add tolerance parameter to asof_join (#1443)
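To illustrate what a tolerance means for an as-of join, here is a small standard-library sketch. It is not the ibis implementation: the function asof_match, its arguments, and the choice to treat the tolerance as inclusive are all assumptions made for illustration.

```python
from bisect import bisect_right


def asof_match(left_keys, right_keys, right_values, tolerance=None):
    """Match each left key to the latest right key that is <= the left key.

    Hypothetical helper: `right_keys` must be sorted ascending. When
    `tolerance` is given, a match is kept only if the gap between the two
    keys is at most `tolerance` (inclusive here, by assumption).
    """
    result = []
    for lk in left_keys:
        pos = bisect_right(right_keys, lk) - 1  # index of latest key <= lk
        if pos < 0:
            result.append(None)                 # nothing at or before lk
            continue
        gap = lk - right_keys[pos]
        if tolerance is not None and gap > tolerance:
            result.append(None)                 # nearest match is too old
        else:
            result.append(right_values[pos])
    return result


left = [1, 5, 10]
right_keys = [2, 3, 7]
right_values = ["a", "b", "c"]
unbounded = asof_match(left, right_keys, right_values)
bounded = asof_match(left, right_keys, right_values, tolerance=2)
```

Without a tolerance, the left key 10 matches the right key 7. With a tolerance of 2, that match is dropped because the gap is 3.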
v0.13.0 (March 30, 2018)¶
This release brings new backends, including support for executing against files and MySQL, and adds pandas user-defined scalar and aggregation functions, along with a number of bug fixes and reliability enhancements. We recommend that all users upgrade from earlier versions of Ibis.
New Backends¶
New Features¶
Support for Unsigned Integer Types (#1194)
Support for Interval types and expressions with support for execution on the Impala and Clickhouse backends (#1243)
Isnan, isinf operations for float and double values (#1261)
Support for an interval with a quarter period (#1259)
ibis.pandas.from_dataframe convenience function (#1155)
Remove the restriction on ROW_NUMBER() requiring it to have an ORDER BY clause (#1371)
Add .get() operation on a Map type (#1376)
Allow visualization of custom defined expressions
Add experimental support for pandas UDFs/UDAFs (#1277)
Generalize the use of the where parameter to reduction operations (#1220)
Support for interval operations thanks to @kszucs (#1243, #1260, #1249)
Support for the PARTITIONTIME column in the BigQuery backend (#1322)
Add arbitrary() method for selecting the first non null value in a column (#1230, #1309)
Windowed MultiQuantile operation in the pandas backend thanks to @DiegoAlbertoTorres (#1343)
Rules for validating table expressions thanks to @DiegoAlbertoTorres (#1298)
Complete end-to-end testing framework for all supported backends (#1256)
contains/not contains now supported in the pandas backend (#1210, #1211)
CI builds are now reproducible locally thanks to @kszucs (#1121, #1237, #1255, #1311)
isnan/isinf operations thanks to @kszucs (#1261)
Framework for generalized dtype and schema inference, and implicit casting thanks to @kszucs (#1221, #1269)
Generic utilities for expression traversal thanks to @kszucs (#1336)
Design documentation for ibis (#1351)
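The where parameter on reductions mentioned above restricts an aggregate to the rows that satisfy a condition. One common way to express this in SQL is a CASE expression inside the aggregate. The sqlite3 sketch below shows that pattern. The table, its columns, and the data are hypothetical, and this is not necessarily the SQL that any ibis backend generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (category TEXT, value REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1.0), ("b", 2.0), ("a", 3.0), ("b", 4.0)],
)

# The unconditional sum sees every row. The conditional sum sees NULL for
# rows outside category 'a', and SQL aggregates skip NULLs.
total, total_a = conn.execute(
    "SELECT sum(value), sum(CASE WHEN category = 'a' THEN value END) FROM t"
).fetchone()
conn.close()
```

Both aggregates are computed in a single pass over the table, which is the main advantage over filtering the table first and aggregating twice.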
API Changes¶
Fixing #1378 required the removal of the name parameter to the param() function. Use the name() method instead.
v0.12.0 (October 28, 2017)¶
This release brings Clickhouse and BigQuery SQL support along with a number of bug fixes and reliability enhancements. We recommend that all users upgrade from earlier versions of Ibis.
New Backends¶
New Features¶
Add support for Binary data type (#1183)
Allow users of the BigQuery client to define their own API proxy classes (#1188)
Add support for HAVING in the pandas backend (#1182)
Add struct field tab completion (#1178)
Add expressions for Map/Struct types and columns (#1166)
Support Table.asof_join (#1162)
Allow right side of arithmetic operations to take over (#1150)
Add a data_preload step in pandas backend (#1142)
Expressions in join predicates in the pandas backend (#1138)
Scalar parameters (#1075)
Limited window function support for pandas (#1083)
Implement Time datatype (#1105)
Implement array ops for pandas (#1100)
Support for passing multiple quantiles in .quantile() (#1094)
Support for clip and quantile ops on DoubleColumns (#1090)
Enable unary math operations for pandas, sqlite (#1071)
Enable casting from strings to temporal types (#1076)
Allow selection of whole tables in pandas joins (#1072)
Implement comparison for string vs date and timestamp types (#1065)
Implement isnull and notnull for pandas (#1066)
Allow like operation to accept a list of conditions to match (#1061)
Add a pre_execute step in pandas backend (#1189)
Bug Fixes¶
Remove global expression caching to ensure repeatable code generation (#1179, #1181)
Ensure that DataType and subclasses hash properly (#1172)
Ensure that the pandas backend can deal with unary operations in groupby (#1182)
Incorrect impala code generated for NOT with complex argument (#1176)
BUG/CLN: Fix predicates on Selections on Joins (#1149)
Don’t use SET LOCAL to allow redshift to work (#1163)
Allow empty arrays as arguments (#1154)
Fix column renaming in groupby keys (#1151)
Ensure that we only cast if timezone is not None (#1147)
Fix location of conftest.py (#1107)
TST/Make sure we drop tables during postgres testing (#1101)
Fix misleading join error message (#1086)
BUG/TST: Make hdfs an optional dependency (#1082)
Memoization should include expression name where available (#1080)
Performance Enhancements¶
Contributors¶
The following people contributed to the 0.12.0 release
$ git shortlog -sn --no-merges v0.11.2..v0.12.0
63 Phillip Cloud
8 Jeff Reback
2 Krisztián Szűcs
2 Tory Haavik
1 Anirudh
1 Szucs Krisztian
1 dlovell
1 kwangin
0.11.0 (June 28, 2017)¶
This release brings initial Pandas backend support along with a number of bug fixes and reliability enhancements. We recommend that all users upgrade from earlier versions of Ibis.
New Features¶
Experimental pandas backend to allow execution of ibis expressions against pandas DataFrames
Graphviz visualization of ibis expressions. Implements _repr_png_ for Jupyter Notebook functionality
Ability to create a partitioned table from an ibis expression
Support for missing operations in the SQLite backend: sqrt, power, variance, and standard deviation, regular expression functions, and missing power support for PostgreSQL
Support for schemas inside databases with the PostgreSQL backend
Appveyor testing on core ibis across all supported Python versions
Add year/month/day methods to date types
Ability to sort, group by and project columns according to positional index rather than only by name
Added a type parameter to ibis.literal to allow user specification of literal types
Bug Fixes¶
Fix broken conda recipe
Fix incorrectly typed fillna operation
Fix postgres boolean summary operations
Fix kudu support to reflect client API changes
Fix equality of nested types and construction of nested types when the value type is specified as a string
API Changes¶
Deprecate passing integer values to the ibis.timestamp literal constructor; this will be removed in 0.12.0
Added the admin_timeout parameter to the kudu client connect function
Contributors¶
$ git shortlog --summary --numbered v0.10.0..v0.11.0
58 Phillip Cloud
1 Greg Rahn
1 Marius van Niekerk
1 Tarun Gogineni
1 Wes McKinney
0.8 (May 19, 2016)¶
This release brings initial PostgreSQL backend support along with a number of critical bug fixes and usability improvements. As several correctness bugs with the SQL compiler were fixed, we recommend that all users upgrade from earlier versions of Ibis.
New Features¶
Initial PostgreSQL backend contributed by Phillip Cloud.
Add groupby as an alias for group_by to table expressions
Bug Fixes¶
Fix an expression error when filtering based on a new field
Fix Impala’s SQL compilation of using OR with compound filters
Various fixes with the having(...) function in grouped table expressions
Fix CTE (WITH) extraction inside UNION ALL expressions.
Fix ImportError on Python 2 when mock library not installed
API Changes¶
The deprecated ibis.impala_connect and ibis.make_client APIs have been removed
0.7 (March 16, 2016)¶
This release brings initial Kudu-Impala integration and improved Impala and SQLite support, along with several critical bug fixes.
New Features¶
Apache Kudu (incubating) integration for Impala users. See the blog post for now. Will add some documentation here when possible.
Add use_https option to ibis.hdfs_connect for WebHDFS connections in secure (Kerberized) clusters without SSL enabled.
Correctly compile aggregate expressions involving multiple subqueries.
To explain this last point in more detail, suppose you had:
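import ibis

# Unbound table with a string flag column and a double value column
table = ibis.table([('flag', 'string'),
                    ('value', 'double')],
                   'tbl')
# Two filtered views of the same table
flagged = table[table.flag == '1']
unflagged = table[table.flag == '0']
fv = flagged.value
uv = unflagged.value
# Difference of two ratios, each computed over a different subquery
expr = (fv.mean() / fv.sum()) - (uv.mean() / uv.sum())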
The last expression now generates the correct Impala or SQLite SQL:
SELECT t0.`tmp` - t1.`tmp` AS `tmp`
FROM (
  SELECT avg(`value`) / sum(`value`) AS `tmp`
  FROM tbl
  WHERE `flag` = '1'
) t0
  CROSS JOIN (
    SELECT avg(`value`) / sum(`value`) AS `tmp`
    FROM tbl
    WHERE `flag` = '0'
  ) t1
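As a sanity check on the shape of that query, the sketch below runs an equivalent statement against an in-memory SQLite database using only the Python standard library and compares it with the same arithmetic done by hand. The sample rows are hypothetical, identifier quoting is dropped for simplicity, and ibis itself is not involved.

```python
import sqlite3

# Hypothetical sample data: (flag, value)
rows = [("1", 1.0), ("1", 3.0), ("0", 2.0), ("0", 2.0), ("0", 5.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (flag TEXT, value REAL)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)

# Each subquery aggregates one filtered view; the cross join pairs the two
# single-row results so they can be subtracted.
query = """
SELECT t0.tmp - t1.tmp AS tmp
FROM (
  SELECT avg(value) / sum(value) AS tmp FROM tbl WHERE flag = '1'
) t0
CROSS JOIN (
  SELECT avg(value) / sum(value) AS tmp FROM tbl WHERE flag = '0'
) t1
"""
(result,) = conn.execute(query).fetchone()
conn.close()


def ratio(flag):
    """mean / sum over the rows with the given flag, computed in Python."""
    vals = [v for f, v in rows if f == flag]
    return (sum(vals) / len(vals)) / sum(vals)


expected = ratio("1") - ratio("0")
```

Because each subquery returns exactly one row, the cross join yields exactly one row as well.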
Bug Fixes¶
CHAR(n) and VARCHAR(n) Impala types now correctly map to Ibis string expressions
Fix inappropriate projection-join-filter expression rewrites resulting in incorrect generated SQL.
ImpalaClient.create_table correctly passes STORED AS PARQUET for format='parquet'.
Fixed several issues with Ibis dependencies (impyla, thriftpy, sasl, thrift_sasl), especially for secure clusters. Upgrading will pull in these new dependencies.
Do not fail in ibis.impala.connect when trying to create the temporary Ibis database if no HDFS connection passed.
Fix join predicate evaluation bug when column names overlap with table attributes.
Fix handling of fully-materialized joins (aka select * joins) in SQLAlchemy / SQLite.
Contributors¶
Thank you to all who contributed patches to this release.
$ git log v0.6.0..v0.7.0 --pretty=format:%aN | sort | uniq -c | sort -rn
21 Wes McKinney
1 Uri Laserson
1 Kristopher Overholt
0.6 (December 1, 2015)¶
This release brings expanded pandas and Impala integration, including support for managing partitioned tables in Impala. See the new Ibis for Impala Users guide for more on using Ibis with Impala.
The Ibis for SQL Programmers guide was also written since the 0.5 release.
This release also includes bug fixes affecting generated SQL correctness. All users should upgrade as soon as possible.
New Features¶
New integrated Impala functionality. See Ibis for Impala Users for more details on these things.
  Improved Impala-pandas integration. Create tables or insert into existing tables from pandas DataFrame objects.
  Partitioned table metadata management API. Add, drop, alter, and insert into table partitions.
  Add is_partitioned property to ImpalaTable.
  Added support for LOAD DATA DDL using the load_data function, also supporting partitioned tables.
  Modify table metadata (location, format, SerDe properties etc.) using ImpalaTable.alter
  Interrupting Impala expression execution with Control-C will attempt to cancel the running query with the server.
  Set the compression codec (e.g. snappy) used with ImpalaClient.set_compression_codec.
  Get and set query options for a client session with ImpalaClient.get_options and ImpalaClient.set_options.
  Add ImpalaTable.metadata method that parses the output of the DESCRIBE FORMATTED DDL to simplify table metadata inspection.
  Add ImpalaTable.stats and ImpalaTable.column_stats to see computed table and partition statistics.
  Add CHAR and VARCHAR handling
  Add refresh, invalidate_metadata DDL options and add incremental option to compute_stats for COMPUTE INCREMENTAL STATS.
Add substitute method for performing multiple value substitutions in an array or scalar expression.
Division is by default true division like Python 3 for all numeric data. This means for SQL systems that use C-style division semantics, the appropriate CAST will be automatically inserted in the generated SQL.
Easier joins on tables with overlapping column names. See Ibis for SQL Programmers.
Expressions like string_expr[:3] now work as expected.
Add coalesce instance method to all value expressions.
Passing limit=None to the execute method on expressions disables any default row limits.
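The division entry above is easiest to see with a concrete engine. SQLite is one system with C-style integer division, so the standard-library sketch below shows the difference a CAST makes. The literal values are arbitrary, and the exact CAST that ibis inserts for a given backend is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integer / integer truncates in SQLite; casting one operand to REAL
# gives true division, matching Python 3's `/` operator.
c_style, true_div = conn.execute(
    "SELECT 7 / 2, CAST(7 AS REAL) / 2"
).fetchone()
conn.close()

python_true_div = 7 / 2    # Python 3 true division
python_floor_div = 7 // 2  # explicit floor division
```

For positive operands, C-style truncation and Python floor division agree. They differ for negative operands, where C-style division truncates toward zero.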
API Changes¶
ImpalaTable.rename no longer mutates the calling table expression.
Contributors¶
$ git log v0.5.0..v0.6.0 --pretty=format:%aN | sort | uniq -c | sort -rn
46 Wes McKinney
3 Uri Laserson
1 Phillip Cloud
1 mariusvniekerk
1 Kristopher Overholt
0.5 (September 10, 2015)¶
Highlights in this release are SQLite, Python 3, and Impala UDA support, and an asynchronous execution API. There are also many usability improvements, bug fixes, and other new features.
New Features¶
SQLite client and built-in function support
Ibis now supports Python 3.4 as well as 2.6 and 2.7
Ibis can utilize Impala user-defined aggregate (UDA) functions
SQLAlchemy-based translation toolchain to enable more SQL engines having SQLAlchemy dialects to be supported
Many window function usability improvements (nested analytic functions and deferred binding conveniences)
More convenient aggregation with keyword arguments in aggregate functions
Built preliminary wrapper API for MADLib-on-Impala
Add var and std aggregation methods and support in Impala
Add nullifzero numeric method for all SQL engines
Add rename method to Impala tables (for renaming tables in the Hive metastore)
Add close method to ImpalaClient for session cleanup (#533)
Add relabel method to table expressions
Add insert method to Impala tables
Add compile and verify methods to all expressions to test compilation and ability to compile (since many operations are unavailable in SQLite, for example)
API Changes¶
Impala Ibis client creation now uses only ibis.impala.connect, and ibis.make_client has been deprecated
Contributors¶
$ git log v0.4.0..v0.5.0 --pretty=format:%aN | sort | uniq -c | sort -rn
55 Wes McKinney
9 Uri Laserson
1 Kristopher Overholt
0.4 (August 14, 2015)¶
New Features¶
Add tooling to use Impala C++ scalar UDFs within Ibis (#262, #195)
Support and testing for Kerberos-enabled secure HDFS clusters
Many table functions can now accept functions as parameters (invoked on the calling table) to enhance composability and emulate late-binding semantics of languages (like R) that have non-standard evaluation (#460)
Add any, all, notany, and notall reductions on boolean arrays, as well as cumany and cumall
Using topk now produces an analytic expression that is executable (as an aggregation) but can also be used as a filter as before (#392, #91)
Added experimental database object “usability layer”, see ImpalaClient.database.
Add TableExpr.info
Add compute_stats API to table expressions referencing physical Impala tables
Add explain method to ImpalaClient to show query plan for an expression
Add chmod and chown APIs to HDFS interface for superusers
Add convert_base method to strings and integer types
Add option to ImpalaClient.create_table to create empty partitioned tables
ibis.cross_join can now join more than 2 tables at once
Add ImpalaClient.raw_sql method for running naked SQL queries
ImpalaClient.insert now validates schemas locally prior to sending query to cluster, for better usability.
Add conda installation recipes
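The entry above about table functions accepting functions as parameters describes a late-binding pattern: the function is called with the table it was invoked on, so it can refer to intermediate results in a chain. The toy class below sketches that mechanism in plain Python. MiniTable and its methods are hypothetical stand-ins and are unrelated to the real ibis classes.

```python
class MiniTable:
    """Hypothetical stand-in for a table expression: a list of row dicts."""

    def __init__(self, rows):
        self.rows = list(rows)

    def column(self, name):
        return [row[name] for row in self.rows]

    def filter(self, mask):
        # Late binding: a callable is evaluated against the calling table,
        # so it sees the result of earlier steps in a method chain.
        if callable(mask):
            mask = mask(self)
        return MiniTable(r for r, keep in zip(self.rows, mask) if keep)


table = MiniTable([
    {"flag": "x", "value": 1},
    {"flag": "x", "value": 2},
    {"flag": "y", "value": 3},
])

# The second lambda receives the already-filtered table, which has no
# variable name of its own in this chain.
result = (
    table
    .filter(lambda t: [v > 1 for v in t.column("value")])
    .filter(lambda t: [f == "x" for f in t.column("flag")])
)
```

Without callable support, the intermediate table would have to be assigned to a variable before the second filter could refer to its columns.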
Contributors¶
$ git log v0.3.0..v0.4.0 --pretty=format:%aN | sort | uniq -c | sort -rn
38 Wes McKinney
9 Uri Laserson
2 Meghana Vuyyuru
2 Kristopher Overholt
1 Marius van Niekerk
0.3 (July 20, 2015)¶
First public release. See http://ibis-project.org for more.
New Features¶
Implement window / analytic function support
Enable non-equijoins (join clauses with operations other than ==).
Add remaining string functions supported by Impala.
Add pipe method to tables (hat-tip to the pandas dev team).
Add mutate convenience method to tables.
Fleshed out WebHDFS implementations: get/put directories, move files, etc. See the full HDFS API.
Add truncate method for timestamp values
ImpalaClient can execute scalar expressions not involving any table.
Can also create internal Impala tables with a specific HDFS path.
Make Ibis’s temporary Impala database and HDFS paths configurable (see ibis.options).
Add truncate_table function to client (if the user’s Impala cluster supports it).
Python 2.6 compatibility
Enable Ibis to execute concurrent queries in multithreaded applications (earlier versions were not thread-safe).
Test data load script in scripts/load_test_data.py
Add an internal operation type signature API to enhance developer productivity.
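Truncating a timestamp, as in the truncate entry above, means resetting every field finer than the chosen unit. The sketch below shows that idea with the standard datetime module. The function and the unit names ("year", "month", and so on) are hypothetical and may not match the unit strings ibis accepts.

```python
from datetime import datetime

# Fields from coarsest to finest, and the value each resets to.
_FIELDS = ["year", "month", "day", "hour", "minute", "second"]
_RESET = {"month": 1, "day": 1, "hour": 0, "minute": 0,
          "second": 0, "microsecond": 0}


def truncate(ts, unit):
    """Reset every field of `ts` that is finer than `unit`.

    Hypothetical helper; raises ValueError for an unknown unit.
    """
    idx = _FIELDS.index(unit)
    finer = _FIELDS[idx + 1:] + ["microsecond"]
    return ts.replace(**{name: _RESET[name] for name in finer})


ts = datetime(2015, 7, 20, 13, 45, 30, 123456)
by_month = truncate(ts, "month")
by_hour = truncate(ts, "hour")
by_second = truncate(ts, "second")
```

Truncation always rounds down, so the result never falls after the input timestamp.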
Contributors¶
$ git log v0.2.0..v0.3.0 --pretty=format:%aN | sort | uniq -c | sort -rn
59 Wes McKinney
29 Uri Laserson
4 Isaac Hodes
2 Meghana Vuyyuru
0.2 (June 16, 2015)¶
New Features¶
insert method on Ibis client for inserting data into existing tables.
parquet_file, delimited_file, and avro_file client methods for querying datasets not yet available in Impala
New ibis.hdfs_connect method and HDFS client API for WebHDFS for writing files and directories to HDFS
New timedelta API and improved timestamp data support
New bucket and histogram methods on numeric expressions
New category logical datatype for handling bucketed data, among other things
Add summary API to numeric expressions
Add value_counts convenience API to array expressions
New string methods like, rlike, and contains for fuzzy and regex searching
Add options.verbose option and configurable options.verbose_log callback function for improved query logging and visibility
Support for new SQL built-in functions
  ibis.coalesce
  ibis.greatest and ibis.least
  ibis.where for conditional logic (see also ibis.case and ibis.cases)
  nullif method on value expressions
  ibis.now
New aggregate functions: approx_median, approx_nunique, and group_concat
where argument in aggregate functions
Add having method to group_by intermediate object
Added group-by convenience table.group_by(exprs).COLUMN_NAME.agg_function()
Add default expression names to most aggregate functions
New Impala database client helper methods
  create_database
  drop_database
  exists_database
  list_databases
  set_database
Client list_tables searching / listing method
Add add, sub, and other explicit arithmetic methods to value expressions
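For readers unfamiliar with the SQL built-ins named in the list above, the sketch below evaluates their SQLite counterparts through the standard sqlite3 module. It shows the SQL semantics only, and ibis is not used. SQLite has no functions named greatest or least, so its multi-argument max and min stand in for them here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute(
    """
    SELECT coalesce(NULL, NULL, 3),  -- first non-NULL argument
           nullif(5, 5),             -- NULL when both arguments are equal
           nullif(5, 4),             -- otherwise the first argument
           max(1, 7, 3),             -- plays the role of greatest
           min(1, 7, 3)              -- plays the role of least
    """
).fetchone()
conn.close()

coalesced, equal_case, unequal_case, greatest, least = row
```

SQL NULL arrives in Python as None, which is why the equal-arguments nullif case comes back as None.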
API Changes¶
New Ibis client and Impala connection workflow. Client now combined from an Impala connection and an optional HDFS connection
Bug Fixes¶
Numerous expression API bug fixes and rough edges fixed
Contributors¶
$ git log v0.1.0..v0.2.0 --pretty=format:%aN | sort | uniq -c | sort -rn
71 Wes McKinney
1 Juliet Hougland
1 Isaac Hodes
0.1 (March 26, 2015)¶
First Ibis release.
Expression DSL design and type system
Expression to ImpalaSQL compiler toolchain
Impala built-in function wrappers
$ git log 84d0435..v0.1.0 --pretty=format:%aN | sort | uniq -c | sort -rn
78 Wes McKinney
1 srus
1 Henry Robinson