Every bit counts: The binary representation of typed data and programs

ANDREW J. KENNEDY; DIMITRIOS VYTINIOTIS

doi:10.1017/S0956796812000263


Part of: JFP Research Articles

Published online by Cambridge University Press: 15 August 2012


Affiliation (both authors): Microsoft Research, Cambridge, CB3 0FB, UK (e-mail: akenn@microsoft.com, dimitris@microsoft.com)


Abstract


We show how the binary encoding and decoding of typed data and typed programs can be understood, programmed and verified with the help of question–answer games. The encoding of a value is determined by the yes/no answers to a sequence of questions about that value; conversely, decoding is the interpretation of binary data as answers to the same question scheme. We introduce a general framework for writing and verifying game-based codecs. We present games in Haskell for structured, recursive, polymorphic and indexed types, building up to a representation of well-typed terms in the simply-typed λ-calculus with polymorphic constants. The framework makes novel use of isomorphisms between types in the definition of games. The definition of isomorphisms, together with additional simple properties, makes it easy to prove that codecs derived from games never encode two distinct values using the same code, never decode two codes to the same value, and interpret any bit sequence as a valid code for a value or as a prefix of a valid code. Formal properties of the framework have been proved using the Coq proof assistant.
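The question–answer idea can be illustrated with a minimal Haskell sketch. It is not the article's code: the names `Iso`, `Game`, `Single`, `Split`, `enc`, `dec`, `boolGame` and `natGame`, the use of `Bool` for bits, and the choice of `False` for the left answer are all assumptions made for illustration. A game either has one possible value left (no more questions) or asks a question whose answer splits the type in two via an isomorphism.

```haskell
{-# LANGUAGE GADTs #-}

import Numeric.Natural (Natural)

-- A pair of functions assumed to be mutually inverse (hypothetical name).
data Iso t s = Iso { to :: t -> s, from :: s -> t }

-- A game for type t: either t has a single value, or one question
-- splits t into two smaller types, each with its own game.
data Game t where
  Single :: Iso t () -> Game t
  Split  :: Iso t (Either a b) -> Game a -> Game b -> Game t

type Bit = Bool

-- Encoding records the answer to each question as one bit.
enc :: Game t -> t -> [Bit]
enc (Single _) _ = []
enc (Split iso g1 g2) x = case to iso x of
  Left a  -> False : enc g1 a
  Right b -> True  : enc g2 b

-- Decoding replays the bits as answers to the same questions.
-- Nothing means the input was a proper prefix of a valid code.
dec :: Game t -> [Bit] -> Maybe (t, [Bit])
dec (Single iso) bits = Just (from iso (), bits)
dec (Split _ _ _) []  = Nothing
dec (Split iso g1 g2) (b : bits)
  | b         = fmap (\(y, rest) -> (from iso (Right y), rest)) (dec g2 bits)
  | otherwise = fmap (\(x, rest) -> (from iso (Left x), rest)) (dec g1 bits)

unitGame :: Game ()
unitGame = Single (Iso id id)

-- One question: "is the value True?"
boolGame :: Game Bool
boolGame = Split (Iso ask bld) unitGame unitGame
  where
    ask b          = if b then Right () else Left ()
    bld (Left ())  = False
    bld (Right ()) = True

-- Repeated question: "is the number non-zero?" (a unary code).
natGame :: Game Natural
natGame = Split (Iso ask bld) unitGame natGame
  where
    ask 0          = Left ()
    ask n          = Right (n - 1)
    bld (Left ())  = 0
    bld (Right n)  = n + 1
```

Under these assumptions, `enc natGame 3` yields `[True, True, True, False]`, and `dec natGame` applied to that list followed by extra bits returns `Just (3, rest)` with the extra bits left over. A truncated input such as `[True, True]` decodes to `Nothing`, which corresponds to the "prefix of a valid code" case in the abstract.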

Type: Articles
Information: Journal of Functional Programming, Volume 22, Special Issue 4-5: ICFP 2010, September 2012, pp. 529–573

DOI: https://doi.org/10.1017/S0956796812000263
Copyright: © Cambridge University Press 2012

